Synopsis


Contributions to Information Theory with Special 
Reference to Subadditivity and Optimization 

A Thesis Submitted 

In Partial Fulfilment of the Requirements 
for the degree of 
Doctor of Philosophy 
By 

VEMPATY NARASIMHA MURTHY 
to the 

Department of Mathematics 
Indian Institute of Technology, Kanpur

In 1948, Shannon introduced a mathematical model of a
communication system and a function to quantify the
information given by a random variable. His measure of
information, or entropy, is given by

$$H(P) = -\sum_{i=1}^{n} p_i \ln p_i,$$

where $P = (p_1, p_2, \ldots, p_n)$, $0 \le p_i \le 1$,
$\sum_{i=1}^{n} p_i = 1$, is the probability distribution of a
discrete random variable X. H(P) is built on some intuitively
necessary axioms for the concept of information. In addition to
these axioms, H(P) satisfies properties such as subadditivity,
concavity with respect to P, recursivity and so on. Any measure
of information is expected to satisfy most of these properties,
if not all.

Renyi (1961), Havrda and Charvat (1967), Behara and Chawla (1974),
Aczel and Daroczy (1963), Kapur (1967, 1984, 1985, 1986), Van der
Lubbe et al. (1984), Sharma and Taneja (1975), and Sharma and
Mittal (1975) have all proposed new and generalized measures
of entropy with one or more parameters. In Chapter 2,
section 1, we investigate which of these measures satisfy
which properties, and give proofs in the cases where none
have been attempted so far.

In section 2, we deal with measures of directed divergence
and in section 3, with measures of inaccuracy. In both cases
the idea is the same: to find out exhaustively the properties
of the measure under consideration.

Several proofs of properties, and counter-examples to properties,
are given. At the end of each section, a comprehensive table of
measures versus properties is provided, which shows at a glance
whether a particular measure satisfies a particular property.

In 1977, A.B. El-Sayeed introduced an inequality
called the Independence Inequality for measures of entropy.
It reads

$$H(P*Q) \le H(PQ),$$

or in words, the entropy of a joint probability scheme is
maximum for the product distribution of the marginal
probability distributions. In Chapter 3 we introduce the
independence inequality for directed divergence measures
and obtain various results for i) Renyi's measure of
directed divergence, ii) Havrda-Charvat's measure of
directed divergence, iii) Kapur's measure of directed
divergence of order $\alpha$ and type $\beta$ and iv) Sharma and
Gupta's measure of directed divergence. We also establish
the connection between subadditivity and the independence
inequality for directed divergence measures.

In Chapter 4, we consider the properties of subadditivity
and superadditivity for Havrda-Charvat's and Renyi's
measures of entropy. Here we consider these measures as
families of measures, rather than single measures. Hence
we are able to discuss clearly the subadditivity and
superadditivity for some values of the parameters, in both
cases. We also categorize the set of probability
distributions into two classes. For type I distributions,
we establish a point $\alpha^*$ where Renyi's measure of entropy
changes from being subadditive to superadditive. We also
show the use of this critical point by building a
measure of dependence between probability distributions.

We do the same for Havrda-Charvat's measure of entropy.

In Chapters 5 and 6 we consider some applications of
the concept of directed divergence in Information Theory.

The classical distance between two points x, y in Euclidean
space is a function d(., .) satisfying

i) $d(x, y) \ge 0\ \forall$ x and y

ii) d(x, y) = 0 if and only if x = y

iii) $d(x, y) = d(y, x)\ \forall$ x and y (symmetry property)

iv) $d(x, z) \le d(x, y) + d(y, z)\ \forall$ x, y and z (triangle
inequality).

Now we can consider D(P:Q), the directed divergence
between two probability distributions P and Q, as the
distance between two points P and Q in the space of
probability distributions. By definition D(P:Q) satisfies i) and
ii) above, but iii) and iv) are relaxed. Then, with D(P:Q)
as the distance and probability distributions as points, we
can have a geometry of probability space.

In Chapter 5 we consider two optimization problems.
The first problem is motivated by an optimization problem
solved by Kullback [S. Kullback, "Information Theory and
Statistics", John Wiley, New York, 1959] in which he
minimized D(P:R) subject to $D(P:R) - D(P:Q) = \theta$, where
$\theta$ is a fixed constant and

$$D(P:R) = \sum_{i=1}^{n} p_i \ln \frac{p_i}{r_i}$$

is the Kullback-Leibler measure of directed divergence. He
discussed the solution for $\theta$ lying between 0 and 1, but
it is actually valid for a larger range of values of $\theta$.
We find this range precisely. We consider the optimization
problem in which we find the extremum values of $\theta$.

The second optimization problem is one in which we
find the maximum and minimum value of

$$g = \sum_{j=1}^{m} \lambda_j D(P:Q_j),$$

where $\lambda_j \ge 0\ \forall j$, $\sum_{j=1}^{m} \lambda_j = 1$, and
$Q_1, Q_2, \ldots, Q_m$ are m given probability distributions.
Later we solve both optimization problems using more
generalized measures of directed divergence as distance functions.

In Chapter 6 we consider seven optimization problems
in probability space, which have very interesting Euclidean
geometric equivalents. These problems were solved by
J.N. Kapur, in a recent paper, using Kullback-Leibler's
measure of directed divergence. We generalize the solutions
of these problems by using

$$D^{\alpha}(P:Q) = \frac{1}{\alpha-1}\left[\sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha} - 1\right],$$

the Havrda-Charvat measure of directed divergence. In cases
where closed form solutions are very complicated, we study
some interesting particular cases with the help of numerical
computations.



Chapter 1 


Introduction 

1.1 Entropy

In 1948, C.E. Shannon constructed a mathematical
theory for communication systems. The rudiments of a
communication system are i) the source from which messages
originate, ii) the channel through which messages are transmitted
and iii) the receiver which receives the transmitted messages.

The source and the receiver were treated as random
experiments with either continuous or discrete outcomes. The
amount of uncertainty about the outcome of a random experiment,
which Shannon called entropy or information, played a key role in
Shannon's model for communication systems.

Let X be a discrete random variable with $P = (p_1, p_2, \ldots, p_n)$,
where $0 \le p_i \le 1$, $i = 1, 2, \ldots, n$, and $\sum_{i=1}^{n} p_i = 1$,
as its probability distribution. Then the following four axioms
were used by Shannon to build his measure of entropy, H(P):

I. $H(1/n, 1/n, \ldots, 1/n) = f(n)$ is a monotonically increasing
function of n (n = 1, 2, ...).

II. $f(nm) = f(n) + f(m)$ (n, m = 1, 2, ...).

III. $$H(p_1, p_2, \ldots, p_n) = H(p_1 + \cdots + p_r,\ p_{r+1} + \cdots + p_n)
+ (p_1 + \cdots + p_r)\, H\!\left(\frac{p_1}{\sum_{i=1}^{r} p_i}, \ldots, \frac{p_r}{\sum_{i=1}^{r} p_i}\right)
+ (p_{r+1} + \cdots + p_n)\, H\!\left(\frac{p_{r+1}}{\sum_{i=r+1}^{n} p_i}, \ldots, \frac{p_n}{\sum_{i=r+1}^{n} p_i}\right)$$

(III is called the 'grouping axiom').

IV. H(p, 1-p) is a continuous function of p.

Shannon obtained

$$H(P) = -\sum_{i=1}^{n} p_i \log p_i \qquad (1.1)$$

as the measure of entropy satisfying the four axioms. In 1957
Khinchin proved that any function satisfying I, II, III and IV
above must be a constant multiple of Shannon's measure of
entropy.
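The grouping axiom III can be checked numerically for (1.1). The following is a minimal sketch, not part of the thesis: the function names, the test distribution and the split point r = 2 are illustrative assumptions, and natural logarithms are used.

```python
import math

def shannon_entropy(p):
    """H(P) = -sum p_i ln p_i, with 0 ln 0 taken as 0."""
    return -sum(x * math.log(x) for x in p if x > 0)

def grouping_rhs(p, r):
    """Right-hand side of axiom III when the first r outcomes form one group."""
    a, b = sum(p[:r]), sum(p[r:])
    first = [x / a for x in p[:r]]     # conditional distribution inside group 1
    second = [x / b for x in p[r:]]    # conditional distribution inside group 2
    return (shannon_entropy([a, b])
            + a * shannon_entropy(first)
            + b * shannon_entropy(second))

P = [0.1, 0.2, 0.3, 0.4]               # hypothetical distribution
lhs = shannon_entropy(P)
rhs = grouping_rhs(P, 2)
uniform_value = shannon_entropy([0.25] * 4)   # f(4) in axiom I, equal to ln 4
```

Here `lhs` and `rhs` agree up to rounding, and `uniform_value` equals ln 4, in line with axioms I and III.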

In the three decades following Shannon's discoveries,
various scientists and engineers obtained many more
measures of entropy by generalizing some of Shannon's axioms
or by deleting some axioms altogether. But not all the
measures proposed later satisfied the properties satisfied by
Shannon's measure.

We discussed in Chapter 2, section 1, the properties
possessed by Shannon's measure of entropy apart from the four
mentioned before. We also tabulated some of the more prominent
measures of entropy and studied their properties. That is, we
checked for the generalized measures the properties possessed
by Shannon's measure. The idea was not to contradict anybody
(because many authors did not publish results regarding the
validity of these properties for their respective measures) but
only to highlight positive and negative aspects of these
measures of entropy. After all, there is no good in treating
a measure which can also assume negative values as a measure
of entropy.


1.2 Directed Divergence, Independence Inequality and
Optimization

S. Kullback and R.A. Leibler introduced in 1951 a
concept which they called mutual information. If $P = (p_1, p_2, \ldots, p_n)$
and $Q = (q_1, q_2, \ldots, q_n)$ were two discrete probability
distributions, then Kullback-Leibler's measure of mutual
information was given by

$$D(P:Q) = \sum_{i=1}^{n} p_i \log \frac{p_i}{q_i} \qquad (1.2)$$

where $q_i \ne 0$ if $p_i \ne 0$, i = 1, 2, ..., n, and 0 log 0 was
treated as zero.


D(P:Q) satisfied the following properties:

(i) $D(P:Q) \ge 0$ for all P and Q

(ii) D(P:Q) = 0 iff P = Q

(iii) D(P:Q) is a convex function of P and of Q.


D(P:Q) can be compared to the classical distance
function d(x, y), where x and y are points in any Euclidean space.
That is, in a way we could treat D(P:Q) as the distance (or the
directed divergence) of P from Q. Here the symmetry property
and the triangle inequality of d(x, y) were relaxed. So D(P:Q)
might not be equal to D(Q:P). In fact that was the reason why
D(P:Q) was referred to as the directed divergence of P from Q.
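The lack of symmetry is easy to see numerically. The sketch below is an illustration only: the function name and the two distributions are assumptions, and natural logarithms are used in (1.2).

```python
import math

def directed_divergence(p, q):
    """D(P:Q) = sum p_i ln(p_i / q_i); terms with p_i = 0 contribute 0."""
    return sum(x * math.log(x / y) for x, y in zip(p, q) if x > 0)

P = [0.5, 0.5]   # hypothetical distributions
Q = [0.9, 0.1]
d_pq = directed_divergence(P, Q)   # divergence of P from Q
d_qp = directed_divergence(Q, P)   # divergence of Q from P
d_pp = directed_divergence(P, P)   # zero, as in property (ii)
```

Both `d_pq` and `d_qp` are positive but differ from each other, while `d_pp` vanishes.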

There were very many directed divergence measures in the
literature of information theory. But not all of them satisfied
all the properties that a directed divergence measure was
expected to satisfy, and indeed there were measures which violated
even the non-negativity criterion. The convexity of directed
divergence measures was insisted upon because then a local
minimum of D(P:Q) over P would be the global minimum. This
property would come in very handy in solving optimization
problems in probability space, which arise in many fields of
research.

In Chapter 2, section 2, we tabulated the properties which
were essential for a directed divergence and checked them for the
measures of directed divergence, which were also tabulated there.

In Chapter 3 we proposed an inequality, the independence
inequality for directed divergence measures. A.B. El-Sayeed
had introduced the independence inequality

$$H(P*Q) \le H(PQ) \qquad (1.3)$$

for measures of entropy. The meaning of (1.3) in words was
that the entropy of a joint probability scheme was maximum for
the product distribution of the marginal distributions.
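For Shannon's measure, (1.3) can be checked directly on a small joint distribution. The following is a minimal sketch under stated assumptions: the 2 x 2 joint table and the helper names are hypothetical, and natural logarithms are used.

```python
import math

def entropy(values):
    """Shannon entropy of a list of probabilities (0 ln 0 taken as 0)."""
    return -sum(x * math.log(x) for x in values if x > 0)

def marginals(joint):
    """Row and column marginals of a joint distribution given as nested lists."""
    p = [sum(row) for row in joint]
    q = [sum(col) for col in zip(*joint)]
    return p, q

joint = [[0.30, 0.10],
         [0.20, 0.40]]                                   # hypothetical P*Q
p, q = marginals(joint)
h_joint = entropy([x for row in joint for x in row])     # H(P*Q)
h_product = entropy([a * b for a in p for b in q])       # H(PQ)
```

For this table `h_joint` does not exceed `h_product`, and `h_product` equals H(P) + H(Q) because Shannon's measure is additive.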
The independence inequality for directed divergence
measures was defined by us as

$$D(P*Q : R*S) \le D(PQ : RS) \qquad (1.4)$$

where P*Q and R*S were joint probability distributions with
P, Q and R, S being the marginal distributions. The meaning of
(1.4) in words was that the directed divergence of a joint
probability distribution from another joint probability
distribution was maximum when the products of their marginal
probability distributions were considered.
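Both sides of (1.4) can be evaluated for any pair of joint tables. The sketch below uses the Kullback-Leibler measure (1.2) as a stand-in; this is an assumption made for illustration, since the measures actually studied in Chapter 3 are parametric. All names and tables are hypothetical.

```python
import math

def kl(p, q):
    """Kullback-Leibler directed divergence with natural logarithms."""
    return sum(x * math.log(x / y) for x, y in zip(p, q) if x > 0)

def flatten(joint):
    return [x for row in joint for x in row]

def product_of_marginals(joint):
    rows = [sum(r) for r in joint]
    cols = [sum(c) for c in zip(*joint)]
    return [[a * b for b in cols] for a in rows]

def independence_gap(joint1, joint2):
    """D(PQ:RS) - D(P*Q:R*S); non-negative exactly when (1.4) holds."""
    lhs = kl(flatten(joint1), flatten(joint2))
    rhs = kl(flatten(product_of_marginals(joint1)),
             flatten(product_of_marginals(joint2)))
    return rhs - lhs

ind1 = [[0.2, 0.2], [0.3, 0.3]]        # product of (0.4, 0.6) and (0.5, 0.5)
ind2 = [[0.06, 0.24], [0.14, 0.56]]    # product of (0.3, 0.7) and (0.2, 0.8)
dep = [[0.30, 0.10], [0.20, 0.40]]     # a dependent joint distribution
uniform = [[0.25, 0.25], [0.25, 0.25]]

gap_independent = independence_gap(ind1, ind2)
gap_dependent = independence_gap(dep, uniform)
```

When both tables are already products, the two sides coincide and the gap is zero. For the dependent table against the uniform one, the gap is negative, which illustrates on this stand-in measure that the inequality depends on the distributions as well as on the measure.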

We considered four measures of directed divergence: Renyi's
measure, Havrda-Charvat's measure, Kapur's measure of order $\alpha$
and type $\beta$, and Sharma and Taneja's measure of directed
divergence. We first established that Renyi's and Havrda-Charvat's
measures of directed divergence either both satisfied the
independence inequality or neither satisfied it. Then we
provided examples of joint probability distributions for which
the independence inequality was not satisfied by any of the
four measures. So we had established the fact that the
independence inequality was as much a property of the
distributions as it was of a directed divergence measure. Then
we constructed some general probability distributions for which
all four measures satisfied the independence inequality.

The connection between the independence inequality and the
subadditivity property of directed divergence measures was
also established. If a directed divergence measure satisfied
the independence inequality and was also additive, then it
satisfied the subadditivity law as well.

In Chapter 5 we considered two optimization problems in
probability space. In the first problem we obtained the
maximum and minimum of

$$\theta = D(P:Q) - D(P:R) \qquad (1.5)$$

where Q and R were given distributions and D(P:Q) was the
i) Kullback-Leibler, ii) Havrda-Charvat, iii) Kapur or iv) Ferreri
measure of directed divergence. Kullback maximized D(P:Q) subject to

$$\theta = D(P:Q) - D(P:R) \qquad (1.6)$$

where $\theta$ was a real constant, $0 \le \theta \le 1$. But we had shown
that his solution is valid for a bigger range of values of $\theta$.
In fact we had derived the motivation for Problem 1 from
Kullback's problem.

In Problem 2 we found the maximum and minimum of

$$g = \sum_{j=1}^{m} \lambda_j D(P:Q_j) \qquad (1.7)$$

where $Q_j$, j = 1, 2, ..., m, were given probability distributions and
$0 \le \lambda_j \le 1$, $\sum_{j=1}^{m} \lambda_j = 1$, were constants. Here
$D(P:Q_j)$ was the i) Kullback-Leibler, ii) Havrda-Charvat or
iii) Ferreri measure of directed divergence.
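For the Kullback-Leibler case, the minimum of (1.7) over P can be illustrated with a standard Lagrange-multiplier fact: the minimizer is the normalized weighted geometric mean of the $Q_j$. The sketch below states that fact as an assumption rather than as the derivation used in Chapter 5; the names, weights and distributions are hypothetical, all $q_{ji}$ are taken positive, and the maximum is not covered.

```python
import math

def kl(p, q):
    """Kullback-Leibler directed divergence with natural logarithms."""
    return sum(x * math.log(x / y) for x, y in zip(p, q) if x > 0)

def weighted_divergence(p, qs, lams):
    """g = sum_j lambda_j D(P:Q_j), as in (1.7)."""
    return sum(l * kl(p, q) for l, q in zip(lams, qs))

def normalized_geometric_mean(qs, lams):
    """p_i proportional to prod_j q_ji ** lambda_j (requires q_ji > 0)."""
    raw = [math.exp(sum(l * math.log(q[i]) for l, q in zip(lams, qs)))
           for i in range(len(qs[0]))]
    total = sum(raw)
    return [x / total for x in raw]

qs = [[0.2, 0.3, 0.5], [0.6, 0.3, 0.1]]   # hypothetical Q_1, Q_2
lams = [0.5, 0.5]                          # hypothetical weights
p_star = normalized_geometric_mean(qs, lams)
g_star = weighted_divergence(p_star, qs, lams)
candidates = qs + [[1 / 3, 1 / 3, 1 / 3], [0.1, 0.8, 0.1]]
g_candidates = [weighted_divergence(c, qs, lams) for c in candidates]
```

None of the candidate distributions gives a smaller value of g than `p_star` does.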

We had also provided alternative proofs for Kullback's
results.



In Chapter 6 we had considered seven more optimization
problems with Havrda-Charvat's measure of directed divergence
as the distance function. Kapur had earlier solved these
problems using Kullback-Leibler's measure of directed divergence
as the distance function. We had also illustrated our results
with some numerical computations.

1.3 Inaccuracy:

Let $P = (p_1, p_2, \ldots, p_n)$ be the correct probability
distribution of a stochastic experiment. Let $Q = (q_1, q_2, \ldots, q_n)$
be the probability distribution which was asserted to be the
right distribution for the experiment. It was of real interest
to know how inaccurate the assertion was. D.F. Kerridge,
in 1961, characterized a measure of inaccuracy I(P:Q). He
based his measure on the following four axioms:

I. I is continuous in both $p_i$ and $q_i$ for all i = 1, 2, ..., n.

II. When n equally likely outcomes are stated to be equally
likely, then I is a monotonic increasing function of n.

III. If a statement is broken down into a number of
subsidiary statements, the inaccuracy of the original
statement is a weighted sum of the inaccuracies of the
subsidiary statements.

IV. The inaccuracy of a statement is unchanged if two
alternatives about which the same assertion is made are
combined.


Kerridge had established that all four axioms are
satisfied if and only if

$$I(P:Q) = -k \sum_{i=1}^{n} p_i \log q_i \qquad (1.8)$$

where k is a multiplicative constant.
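With k = 1 and natural logarithms, (1.8) splits into Shannon's entropy (1.1) plus the directed divergence (1.2), so the inaccuracy is never smaller than the entropy of the true distribution. The sketch below checks this; the function names and the two distributions are illustrative assumptions.

```python
import math

def inaccuracy(p, q, k=1.0):
    """Kerridge's I(P:Q) = -k sum p_i ln q_i, as in (1.8)."""
    return -k * sum(x * math.log(y) for x, y in zip(p, q) if x > 0)

def entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0)

def kl(p, q):
    return sum(x * math.log(x / y) for x, y in zip(p, q) if x > 0)

P = [0.5, 0.3, 0.2]   # hypothetical correct distribution
Q = [0.4, 0.4, 0.2]   # hypothetical asserted distribution
i_pq = inaccuracy(P, Q)
decomposition = entropy(P) + kl(P, Q)   # H(P) + D(P:Q)
i_pp = inaccuracy(P, P)                 # reduces to H(P) when the assertion is correct
```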


Again there was a plethora of inaccuracy measures in the
literature. Kerridge's measure satisfied several useful
properties apart from the four axioms mentioned earlier. However,
some of the measures obtained by other authors did not satisfy
all of these properties. For example, III, the recursivity (or
grouping) axiom, is satisfied only by Kerridge's measure.
There were some measures which also assumed negative values,
but there was no plausible interpretation for negative inaccuracy.
We had listed the more prominent measures of inaccuracy, and also
the properties which were desirable for an inaccuracy measure,
in section 3, Chapter 2. There we had endeavoured to verify
the properties for each measure listed. In the cases where a
measure did not satisfy a particular property, a counter-example
had been given.

1.4 Subadditivity, Superadditivity and Measures of Dependence:

Shannon's measure of entropy satisfied the subadditivity
law,

$$H(P*Q) \le H(P) + H(Q) \qquad (1.9)$$

and Renyi's measure of entropy of order $\alpha$ satisfied the
superadditivity law,

$$H_{\alpha}(P*Q) \ge H_{\alpha}(P) + H_{\alpha}(Q) \qquad (1.10)$$

for some values of $\alpha$ and some distributions P*Q.

Renyi's measure of entropy, or any other measure of
entropy of order $\alpha$ and type $\beta$, was not a single measure. In fact
it was a family of measures, one measure corresponding to each
value of $\alpha$ or $\beta$. Though we knew that the subadditivity property
was not satisfied in general by Renyi's measure (or Havrda-Charvat's
measure, or any other measure with one or more parameters), it
was of real interest to know for what range of values of $\alpha$,
and for what type of distributions, it satisfied the subadditivity
law.

In Chapter 4, we categorized the set of joint probability
distributions into two classes, Type I and Type II. We
established that for Type I distributions there existed a
point $\alpha^*$ such that $1 \le \alpha^*$, and $H_{\alpha}$ was subadditive for
all $\alpha > \alpha^*$ and superadditive for $0 < \alpha < \alpha^*$. However, we
were unable to rule out the possible existence of more than one
$\alpha^*$, but we proved that if there existed more than one $\alpha^*$ then
there was an even number of such numbers.

Now from the relation

$$H^{\alpha}(PQ) = H^{\alpha}(P) + H^{\alpha}(Q) + (1-\alpha) H^{\alpha}(P) H^{\alpha}(Q) \qquad (1.11)$$

we got that $\alpha^* = 1$ if P*Q = PQ, because $H_{\alpha}$ is additive. That is,
if P and Q are independent, $\alpha^* = 1$. From the several calculations
we had made, we observed that $\alpha^*$ was closer to unity when P and Q
were close to being independent, and $\alpha^*$ was significantly farther
from unity if P and Q were far from being independent. A natural
consequence of our work was to use $\alpha^*$ to measure the dependence
between P and Q via P*Q. We had proposed to do it by evaluating
$(\alpha^* - 1)$.

Similarly we had established another measure of dependence,
$(1 - \alpha^*)$, by using the subadditivity and superadditivity properties
of Havrda-Charvat's measure of entropy. Here we noted that,
irrespective of the type of the distribution, there existed
at least one $\alpha^* \le 1$ such that for $0 < \alpha < \alpha^*$, $H^{\alpha}(P*Q)$ is
subadditive, while for $\alpha > \alpha^*$, $H^{\alpha}(P*Q)$ is superadditive.



Chapter 2 


A Comparative Study of Various Measures of
Information


Introduction: In this chapter we study various measures of
entropy, directed divergence and inaccuracy with respect to
their properties. We take up these concepts in a section each.
At the beginning of each section a list of various measures of
the respective concept is provided. Then a list of properties
that an ideal measure of that concept is expected to satisfy is
given. Finally a table of measures against their properties is
compiled. This helps in finding out at a glance which measure
satisfies which properties.

We provide proofs or counter-examples only for those
results which have not yet been proved or disproved. The proofs
of other results are omitted.

Section 1 deals with the measures of entropy, section 2
with the measures of directed divergence and section 3 with the
measures of inaccuracy.

2.1 Measures of Entropy

2.1.1 List of Measures of Entropy:

The following measures of entropy are considered in this
section. Let $P = (p_1, p_2, \ldots, p_n)$ be a discrete probability
distribution, $p_i \ge 0\ \forall i$ and $\sum_{i=1}^{n} p_i = 1$.

1) Shannon's [56]:

$$H(P) = -\sum_{i=1}^{n} p_i \ln p_i$$

2) Renyi's [65] measure of order $\alpha$:

$$H_{\alpha}(P) = \frac{1}{1-\alpha} \ln \sum_{i=1}^{n} p_i^{\alpha}, \quad \alpha \ne 1$$

3) Havrda and Charvat's [8] measure of degree $\alpha$:

$$H^{\alpha}(P) = \frac{1}{1-\alpha}\left\{\sum_{i=1}^{n} p_i^{\alpha} - 1\right\}, \quad \alpha \ne 1$$

4) Kapur's [9] measure of order $\alpha$ and type $\beta$:

$$H_{\alpha,\beta}(P) = \frac{1}{1-\alpha} \ln \left\{\frac{\sum_{i=1}^{n} p_i^{\alpha+\beta-1}}{\sum_{i=1}^{n} p_i^{\beta}}\right\}, \quad \alpha \ne 1.$$

This measure was independently obtained by Aczel and
Daroczy [1]. Therefore it will be referred to as the Kapur-Aczel
and Daroczy measure of information of order $\alpha$ and
type $\beta$.

5) Kapur's [l7] four families of measures of entropy : 

(i) =5 ~ 2 Pj^ In p^ + |- 2 £ (l+ap^)ln(l+ap^)-apj; }^a>0 

i=l i=l 


n n 

(ii) = - 2 Pj^ ^ (l+bp^)ln(l+bpj_) 

i=l i-1 


-Cltb)ln (Itb)} b > 0 


n 1 ^ 


(iii) H^(P) = - S In E (H-cp^ )ln(l+cp^)-c} 

i=l c issl 

n 1 ^ 

(iv) H-j^(P ) = — E p< In p. + S (l+Jcpj )ln(l+]<pj ) 

i=l ^ ^ K ±5=1 ^ ^ 

- (l+k)ln(i+k)} 

6) Behara and Chawla's [3] $\gamma$-entropy:

I - (,Z pi-'V . 

H^(P) = T / 1, T > 0 


l-e 


7) Sharma and Taneja's [57] two-parameter measures of entropy:


Ci) measure of order a and type /3 i 

n 


h“'^(p) = 


(2 


1 /n 


n 


(ii) logarithm measure : Ht(p) = -2 *" E P 4 In p. 

^ i=l ^ ^ 

n 

(iii) Sine measure : Hg(p) = S sinO log p^) 

i™i* 

8) Rathie's [51] measure of entropy with (m+1) parameters:

- 1} a / 1- 


a+^^-l 

"(P) = -rr~ { E 

2^-1 i=»l 


a 


Pi 


9) Sharma and Mittal's [58] measure of entropy:

. a > o. 3 ^ 1* 

10) J.C.A. van der Lubbe et al.'s [61] generalized measures:

n „ _ 

Pi^ 


(i) H“;(p;p/a,6) = - 6 log £ e 

Ii *** 
n


(ii) H^(P;p,a,6) =6{1 - (_2 p^)^} 


n 


(Hi) H^(P;p,a,6) == 6{( S - 1} 


11) 


1=^1 


where (p^a) e 0 = {(p,a)lp> 1/ o > 0V0< p< 1, a < 0} 
Kapur'' s [ 40 ] generalized measure of entropy : 


f ^ a-ib , ^ 

( s p. ) ~ ( r pw 

„b i=i ^ i=l 

n5=aTb 


ai^/3,b?^0, a,^>0. 


12) Kapur's [37] three-parameter measures of entropy:

(i) Hj^(P) = ^ {C^ + (1-X))^“^ *" ■** 

- C S P?(^ + (1-X)p,.)^““ + _S pf(|+ (l->0p^)^“^} 


a~3 i n 


i=l 


(ii) 4^P) = -^7^^ {(| + (l-?0)^^“‘^^^- S pa(^+ (1-X)p^)^^"“^^} 

i=l 


(ill) H^(p) 


Ta-l ) 


In 


ti + U-jOjl-® 


n 


lli 


1-a 


That completes the list of measures of entropy. We now
provide the list of properties of measures of entropy in the
following subsection.


2.1.2 List of Properties of Measures of Entropy:

For the justification of these properties, we refer the reader
to [13, 22, 24, 27, 33, 40]. All these papers contain discussions of
the properties of measures of entropy. The following properties
are verified for the measures of entropy listed in 2.1.1. Let
$\Delta_n = \{P = (p_1, p_2, \ldots, p_n) : p_i \ge 0,\ \sum_{i=1}^{n} p_i = 1\}$
be the set of all discrete probability distributions with n
outcomes and let $H_n(P)$ be any measure of entropy for $P \in \Delta_n$.

1) $H_n(P)$ is a continuous function of the probabilities $p_1, \ldots, p_n$.

2) $H_n(P)$ is a symmetric function of the probabilities.

3) $H_n(P)$ is zero for degenerate P, and this is the minimum
value attained by $H_n(P)$ over $\Delta_n$.

4) $H_n(P)$ is maximum when all the probabilities are equal, that
is, when $P = U \in \Delta_n$.

5) $\max_{P \in \Delta_n} H_n(P)$ is an increasing function of n.

6) If an outcome of zero probability is added to the experiment,
the amount of entropy remains the same. This property is
called the expansibility property.

7) If $P \in \Delta_n$ and $Q \in \Delta_m$ are independent and PQ is their
product distribution in $\Delta_{mn}$, then $H_{mn}(PQ) = H_n(P) + H_m(Q)$.
This property is referred to as additivity.

8) For any $P \in \Delta_n$ and $Q \in \Delta_m$, with P*Q their joint
distribution in $\Delta_{mn}$, we have $H_{mn}(P*Q) \le H_n(P) + H_m(Q)$.
This property is referred to as subadditivity.

9) $H_n(P)$ is a concave function of P. This property of an
information measure is desired because it ensures that a
local maximum is the global maximum. Also, when a
function is concave, Lagrange's multipliers method always
yields a maximum.

10) If any parameters are included in the measure, then it is
a monotonic function with respect to them.

11) It is also either concave or convex with respect to the
parameters.

We now go on to the study of the measures of entropy.
Analysis

2.1.3 Shannon's Measure: It is given by $H(P) = -\sum_{i=1}^{n} p_i \ln p_i$.
It satisfies all the properties (1) to (9). Because there is
no parameter involved in this measure, there is no question of
verifying properties (10) and (11). Its maximum value is ln(n)
and it is a concave function of P. It is additive and satisfies
the subadditivity property. In addition to the above, Shannon's
measure satisfies the recursivity property

$$H_n(p_1, p_2, \ldots, p_n) = H_{n-1}(p_1 + p_2, p_3, \ldots, p_n) + (p_1 + p_2)\, H_2\!\left(\frac{p_1}{p_1+p_2}, \frac{p_2}{p_1+p_2}\right).$$


2.1.4 Renyi's Measure of Entropy: It is defined as

$$H_{\alpha}(P) = \frac{1}{1-\alpha} \ln \sum_{i=1}^{n} p_i^{\alpha}, \quad \alpha \ne 1.$$

It is easily seen that $H_{\alpha}(P) \to H(P)$ as $\alpha$ approaches unity.
Obviously it is a continuous symmetric function of the
probabilities. It has maximum value equal to ln(n), which is
attained for the uniform distribution $U = (1/n, \ldots, 1/n)$ (n times).
Its minimum value is zero and is attained for any degenerate
distribution. Renyi's measure of entropy is additive but it is not
subadditive. We deal with this property in complete detail in
Chapter 4. But here we give an example of A.B. El-Sayeed [5] to
justify our claim.

Example 2.1.4.1: Let $\alpha = 2$, n = 2, m = 2,

(Tlij^) = ( 0 I 27 ^ (0«i,0#9) and Q = (0#29,0#7i) - 

We then have $H_{\alpha}(P*Q) = 1.4822$ and $H_{\alpha}(P) + H_{\alpha}(Q) = 1.4533$.
Thereby we get that $H_{\alpha}(P*Q) > H_{\alpha}(P) + H_{\alpha}(Q)$. Thus for
$\alpha = 2$, $H_{\alpha}$ is not subadditive.
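The two claims, additivity and failure of subadditivity, can be checked on small tables. The sketch below is an illustration only: the helper names are assumptions and the dependent table is a hypothetical example, not the one quoted from [5].

```python
import math

def renyi_entropy(p, alpha):
    """H_alpha(P) = ln(sum p_i^alpha) / (1 - alpha), alpha != 1."""
    return math.log(sum(x ** alpha for x in p if x > 0)) / (1.0 - alpha)

def subadditivity_gap(joint, alpha):
    """H_alpha(P) + H_alpha(Q) - H_alpha(P*Q); negative means subadditivity fails."""
    p = [sum(row) for row in joint]
    q = [sum(col) for col in zip(*joint)]
    flat = [x for row in joint for x in row]
    return (renyi_entropy(p, alpha) + renyi_entropy(q, alpha)
            - renyi_entropy(flat, alpha))

independent = [[0.2, 0.2], [0.3, 0.3]]   # product of (0.4, 0.6) and (0.5, 0.5)
dependent = [[0.8, 0.1], [0.1, 0.0]]     # hypothetical dependent joint table

gap_independent = subadditivity_gap(independent, 2.0)   # zero: additivity
gap_dependent = subadditivity_gap(dependent, 2.0)       # negative: not subadditive
max_value = renyi_entropy([0.25] * 4, 2.0)              # ln 4 for the uniform case
```

For the dependent table the sum of squared joint probabilities (0.66) is smaller than the product of the sums of squared marginals (0.6724), which is why the gap is negative at $\alpha = 2$.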

Renyi's measure of entropy satisfies the expansibility
property. Its concavity with respect to P is discussed
in Ben-Bassat and Raviv [4]. Their results state that (i) $H_{\alpha}(P)$
is strictly concave with respect to P for $0 < \alpha < 1$, (ii) for
n = 2, $H_{\alpha}(P)$ is strictly concave for $0 < \alpha < 2$, (iii) $H_{\alpha}(P)$ is
pseudoconcave for values of $\alpha > 0$.

Kapur [34] has established that $H_{\alpha}(P)$ is a pseudoconvex
function of $\alpha$ for $0 < \alpha < 1$ and a pseudoconvex function of $\alpha$ for
$\alpha > 1$. It is also established there that $H_{\alpha}(P)$ is a monotonically
decreasing function of $\alpha$.

Renyi's measure does not have an easy recursive property.
By the following relation between Renyi's and Havrda and Charvat's
measures of entropy,

$$H_{\alpha}(P) = \frac{1}{1-\alpha} \ln\{(1-\alpha) H^{\alpha}(P) + 1\},$$

and a recursive property for the latter measure, we can deduce
a complicated expression for the recursivity of $H_{\alpha}(P)$. We state
and prove the recursivity of degree $\alpha$ for Havrda-Charvat's measure
of entropy in the next subsection.

With that we conclude the discussion of Renyi's measure of
entropy.

2.1.5 Havrda-Charvat's Measure of Entropy: It is given by

$$H^{\alpha}(P) = \frac{1}{1-\alpha}\left\{\sum_{i=1}^{n} p_i^{\alpha} - 1\right\}, \quad \alpha \ne 1.$$

We can easily see that $H^{\alpha}(P)$ tends to H(P) as $\alpha$ approaches 1.
$H^{\alpha}(P)$ is a continuous, symmetric function of the probabilities.
Its minimum value is zero and is attained for a degenerate
distribution. $H^{\alpha}(P)$ attains its maximum value for the uniform
distribution, and the maximum value is $\frac{n^{1-\alpha} - 1}{1-\alpha}$, which is
not independent of $\alpha$, unlike the maximum value of $H_{\alpha}(P)$. But
again $\max_{P} H^{\alpha}(P)$ approaches ln(n) as $\alpha \to 1$, and it is an
increasing function of n.

Unlike Renyi's and Shannon's measures, Havrda and
Charvat's measure of entropy is non-additive. For independent
probability distributions the following relation holds for $H^{\alpha}(P)$:

$$H^{\alpha}(PQ) = H^{\alpha}(P) + H^{\alpha}(Q) + (1-\alpha) H^{\alpha}(P) H^{\alpha}(Q).$$

Now it follows from the above relation that $H^{\alpha}(P)$ is
additive for $\alpha = 1$. But for $\alpha \ne 1$, unless either of the
distributions P and Q is degenerate, $H^{\alpha}(P)$ is not additive.
Therefore $H^{\alpha}(P)$ is additive only for $\alpha = 1$.
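The relation for independent distributions can be confirmed numerically. The sketch below is illustrative only; the function names and the two marginal distributions are assumptions.

```python
def hc_entropy(p, alpha):
    """H^alpha(P) = (sum p_i^alpha - 1) / (1 - alpha), alpha != 1."""
    return (sum(x ** alpha for x in p if x > 0) - 1.0) / (1.0 - alpha)

def product(p, q):
    """Product distribution PQ of two independent distributions."""
    return [a * b for a in p for b in q]

def relation_residual(p, q, alpha):
    """H(PQ) - [H(P) + H(Q) + (1 - alpha) H(P) H(Q)]; zero if the relation holds."""
    hp, hq = hc_entropy(p, alpha), hc_entropy(q, alpha)
    return hc_entropy(product(p, q), alpha) - (hp + hq + (1.0 - alpha) * hp * hq)

P = [0.2, 0.8]          # hypothetical marginals
Q = [0.1, 0.3, 0.6]
residuals = [relation_residual(P, Q, a) for a in (0.5, 2.0, 3.0)]
excess_at_half = (hc_entropy(product(P, Q), 0.5)
                  - hc_entropy(P, 0.5) - hc_entropy(Q, 0.5))
```

The residuals vanish up to rounding, and at $\alpha = 0.5$ the cross term makes $H^{\alpha}(PQ)$ exceed $H^{\alpha}(P) + H^{\alpha}(Q)$.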


Subadditivity is also not universally true for $H^{\alpha}(P)$.
In fact we can deduce this fact from the above relation itself.
If $\alpha < 1$ we can always find distributions which violate the
subadditivity rule. We discuss in Chapter 4 in complete detail the
subadditivity property of $H^{\alpha}(P)$.

Like Shannon's measure and unlike Renyi's measure, $H^{\alpha}(P)$
is concave for all values of $\alpha$. We can easily verify this by
considering the fact that $\sum_{i=1}^{n} p_i^{\alpha}$ is concave for $\alpha < 1$ and
convex for $\alpha > 1$.


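Now we shall consider recursivity for $H^{\alpha}(P)$.

Proposition 2.1.5.1: $H^{\alpha}(P)$ satisfies the following relation for
all values of $\alpha$:

$$H^{\alpha}_{n}(p_1, p_2, \ldots, p_n) = H^{\alpha}_{n-1}(p_1 + p_2, p_3, \ldots, p_n) + (p_1 + p_2)^{\alpha}\, H^{\alpha}_{2}\!\left(\frac{p_1}{p_1+p_2}, \frac{p_2}{p_1+p_2}\right).$$

(Note: This is not a new result, but the proof is independently
obtained by us.)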

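Proof: Let $G_n(P) = \sum_{i=1}^{n} p_i^{\alpha}$. Then we have

$$H^{\alpha}_n(P) = \frac{1}{1-\alpha}\{G_n(P) - 1\}.$$

We shall first obtain a recursive relation for $G_n(P)$. We have
from the definition

$$G_2\!\left(\frac{p_1}{p_1+p_2}, \frac{p_2}{p_1+p_2}\right) = \frac{p_1^{\alpha} + p_2^{\alpha}}{(p_1+p_2)^{\alpha}}.$$

Now

$$(p_1+p_2)^{\alpha}\, G_2\!\left(\frac{p_1}{p_1+p_2}, \frac{p_2}{p_1+p_2}\right) + G_{n-1}(p_1+p_2, p_3, \ldots, p_n) = G_n(p_1, \ldots, p_n) + (p_1+p_2)^{\alpha}.$$

Therefore

$$G_n(p_1, \ldots, p_n) = G_{n-1}(p_1+p_2, p_3, \ldots, p_n) + (p_1+p_2)^{\alpha}\, G_2\!\left(\frac{p_1}{p_1+p_2}, \frac{p_2}{p_1+p_2}\right) - (p_1+p_2)^{\alpha}$$

and

$$H^{\alpha}_n(p_1, \ldots, p_n) = \frac{1}{1-\alpha}\{G_n(p_1, \ldots, p_n) - 1\}$$
$$= \frac{1}{1-\alpha}\left\{G_{n-1}(p_1+p_2, p_3, \ldots, p_n) - 1 + (p_1+p_2)^{\alpha}\left[G_2\!\left(\frac{p_1}{p_1+p_2}, \frac{p_2}{p_1+p_2}\right) - 1\right]\right\}$$
$$= H^{\alpha}_{n-1}(p_1+p_2, p_3, \ldots, p_n) + (p_1+p_2)^{\alpha}\, H^{\alpha}_2\!\left(\frac{p_1}{p_1+p_2}, \frac{p_2}{p_1+p_2}\right).$$

That completes the proof.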
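The recursivity relation can be verified numerically for several values of $\alpha$. The following sketch is illustrative; the function names and the test distribution are assumptions.

```python
def hc_entropy(p, alpha):
    """H^alpha(P) = (sum p_i^alpha - 1) / (1 - alpha), alpha != 1."""
    return (sum(x ** alpha for x in p if x > 0) - 1.0) / (1.0 - alpha)

def recursivity_rhs(p, alpha):
    """Merge the first two outcomes, then add the weighted two-outcome term."""
    s = p[0] + p[1]
    merged = [s] + list(p[2:])
    return (hc_entropy(merged, alpha)
            + s ** alpha * hc_entropy([p[0] / s, p[1] / s], alpha))

P = [0.1, 0.2, 0.3, 0.4]   # hypothetical distribution
residuals = [hc_entropy(P, a) - recursivity_rhs(P, a) for a in (0.5, 2.0, 3.0)]
```

Note the weight $(p_1+p_2)^{\alpha}$ on the two-outcome term, in contrast to the weight $(p_1+p_2)$ in Shannon's recursivity.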

Kapur [34] has established that $H^{\alpha}(P)$ is a monotonically
decreasing function of $\alpha$. He has also established that it is a
pseudo-convex function of $\alpha$ for $0 < \alpha < 1$ and pseudo-concave for
$\alpha > 1$. But we have obtained the following better result.


Proposition 2.1.5.2: $H^{\alpha}(P)$ is a strictly convex function with
respect to $\alpha$ for $0 < \alpha < 1$. We cannot say anything about the
concavity or convexity of $H^{\alpha}(P)$ for $\alpha > 1$ better than Kapur's
result.

Proof: Let $f(\alpha) = \frac{1}{1-\alpha}\left\{\sum_{i=1}^{n} p_i^{\alpha} - 1\right\}$; then we have

$$f'(\alpha) = \frac{\sum_{i=1}^{n} p_i^{\alpha} \log p_i}{1-\alpha} + \frac{\sum_{i=1}^{n} p_i^{\alpha} - 1}{(1-\alpha)^2}$$

and

$$f''(\alpha) = \frac{\sum_{i=1}^{n} p_i^{\alpha} (\log p_i)^2}{1-\alpha} + \frac{2\sum_{i=1}^{n} p_i^{\alpha} \log p_i}{(1-\alpha)^2} + \frac{2\left(\sum_{i=1}^{n} p_i^{\alpha} - 1\right)}{(1-\alpha)^3}.$$

Note that the last term on the R.H.S. of the above expression is
always positive and the second term negative, irrespective of the
value of $\alpha$. But the first term is positive or negative depending
on whether $\alpha$ is less or greater than 1. Now consider the case
$0 < \alpha < 1$ and rewrite the above expression as

$$f''(\alpha) = \frac{1}{(1-\alpha)^2} \sum_{i=1}^{n} p_i^{\alpha}\left\{(1-\alpha)(\log p_i)^2 + 2 \log p_i\right\} + \frac{2\left(\sum_{i=1}^{n} p_i^{\alpha} - 1\right)}{(1-\alpha)^3}.$$

Now we shall show that the first term in the last expression
is positive. For let it be negative. Then $(1-\alpha)(\log p_i)^2 < 2 \log p_i$
for at least one value of i. Let that value of i be
denoted by k. That is,

$$(1-\alpha)(\log p_k)^2 < 2 \log p_k$$

or $\frac{1-\alpha}{2} \log p_k > 1$ {as $\log p_k < 0$}

or $p_k > e^{2/(1-\alpha)}$ {if natural logarithms were used}.

But we started with $0 < \alpha < 1$, which means $e^{2/(1-\alpha)} > 1$. This
is a contradiction to the fact that $0 < p_i < 1\ \forall i = 1, \ldots, n$.
Therefore our assumption that
$\frac{1}{(1-\alpha)^2} \sum_{i=1}^{n} p_i^{\alpha}\{(1-\alpha)(\log p_i)^2 + 2 \log p_i\}$
is negative is wrong. Therefore we get that for $0 < \alpha < 1$,
$f''(\alpha) > 0$, which means that $f(\alpha)$ is convex for $0 < \alpha < 1$.

It is obvious that we cannot extend similar arguments
to the case of $\alpha > 1$.
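The statement of Proposition 2.1.5.2 can be spot-checked numerically with central second differences of $f(\alpha) = H^{\alpha}(P)$ on a grid inside (0, 1). This is a minimal sketch, not a proof: the distribution, the grid and the step h are illustrative assumptions.

```python
def hc_entropy(p, alpha):
    """H^alpha(P) = (sum p_i^alpha - 1) / (1 - alpha), alpha != 1."""
    return (sum(x ** alpha for x in p if x > 0) - 1.0) / (1.0 - alpha)

def second_differences(p, alphas, h=0.05):
    """Central second differences f(a - h) - 2 f(a) + f(a + h) of f = H^alpha(P)."""
    return [hc_entropy(p, a - h) - 2.0 * hc_entropy(p, a) + hc_entropy(p, a + h)
            for a in alphas]

P = [0.2, 0.3, 0.5]                        # hypothetical distribution
alphas = [0.1 * k for k in range(1, 10)]   # grid 0.1, 0.2, ..., 0.9
diffs = second_differences(P, alphas)      # positive values indicate convexity
values = [hc_entropy(P, a) for a in alphas]
```

On this grid every second difference is positive and the values decrease with $\alpha$, in agreement with the proposition and with the monotonicity result attributed to Kapur [34].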

With that we conclude our discussion of Havrda-Charvat's
measure of entropy.

2.1.6 Kapur-Aczel and Daroczy's Measure of Entropy of Order $\alpha$ and
Type $\beta$: It is given by

$$H_{\alpha,\beta}(P) = \frac{1}{1-\alpha} \ln\left\{\frac{\sum_{i=1}^{n} p_i^{\alpha+\beta-1}}{\sum_{i=1}^{n} p_i^{\beta}}\right\}.$$

We can obtain Renyi's measure of entropy from $H_{\alpha,\beta}(P)$ by taking
$\beta = 1$, and Shannon's measure by taking $\beta = 1$ and letting $\alpha$
approach 1.

$H_{\alpha,\beta}(P)$ is obviously a continuous, symmetric function
of the $p_i$'s. Its minimum value is zero, which occurs for any
degenerate distribution. For uniform distributions its value is
ln(n). But this is not its maximum value for all values
of $\alpha$ and $\beta$. For a detailed discussion refer to [9, 10, 15]. The
range of values of $\alpha$ and $\beta$ for which ln(n) is the maximum is as
follows:

(i) $\alpha > 1$, $0 < \beta < 1$ and $\alpha + \beta > 2$ and

(ii) $\alpha < 1$, $\beta > 1$ and $\alpha + \beta < 2$

because when $\alpha$, $\beta$ take values satisfying (i) or (ii) above,
$H_{\alpha,\beta}(P)$ is pseudo-concave.

Apart from Shannon's and Renyi's measures of entropy,
$H_{\alpha,\beta}(P)$ is the only other measure that is additive.

Obviously this measure is not subadditive for all values
of $\alpha$ and $\beta$. We shall discuss the subadditivity property of this
measure in detail in Chapter 4.


Having considered all other properties of ^ now we 
shall move on to its properties with respect to the parameters- 

Proposi"tion 2 *1*6*1 : „(P) is a monotonically decreasing 

^rP 

function of ^ for all values of a* But we can not conclude 
monotonic behaviour of H„ r,{P) with respect to a* 


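Proof : Let $f(\beta) = \frac{1}{1-\alpha} \log \frac{\sum_{i=1}^{n} p_i^{\alpha+\beta-1}}{\sum_{i=1}^{n} p_i^{\beta}}$; we then have

$f'(\beta) = \frac{1}{1-\alpha} \left[ \frac{\sum_{i=1}^{n} p_i^{\alpha+\beta-1} \log p_i}{\sum_{i=1}^{n} p_i^{\alpha+\beta-1}} - \frac{\sum_{i=1}^{n} p_i^{\beta} \log p_i}{\sum_{i=1}^{n} p_i^{\beta}} \right]$.

Case (i) $\alpha > 1$ : i.e., $\alpha+\beta-1 > \beta$, which implies $p_i^{\alpha+\beta-1} \le p_i^{\beta}$ $\forall$ $i = 1, 2, \ldots, n$.
But we have $0 < p_i < 1 \Rightarrow \log p_i < 0$ $\forall$ $i = 1, 2, \ldots, n$. Therefore
we now have

$\sum_{i=1}^{n} p_i^{\alpha+\beta-1} \log p_i \ge \sum_{i=1}^{n} p_i^{\beta} \log p_i$ and $\sum_{i=1}^{n} p_i^{\alpha+\beta-1} \le \sum_{i=1}^{n} p_i^{\beta}$.

Moreover, the ratio $\sum_i p_i^{t} \log p_i / \sum_i p_i^{t}$ is a non-decreasing function of $t$, since its
derivative with respect to $t$ is the variance of $\log p_i$ under the weights $p_i^{t} / \sum_j p_j^{t}$.
Hence the bracket in $f'(\beta)$ is non-negative,
which immediately gives us $f'(\beta) \le 0$ because $(1-\alpha) < 0$.

Case (ii) $0 < \alpha < 1$ : Presently we have $(1-\alpha) > 0$ and the
inequalities in case (i) are reversed, giving $f'(\beta) \le 0$
again.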

Now consider the following example to verify our claim regarding
the monotonicity of $H_{\alpha,\beta}(P)$ as a function of $\alpha$.

Let P = (0.5, 0.6), $\beta = 2$, $\alpha_1 = 0.5$, $\alpha_2 = 1.5$, $\alpha_3 = 2$, $\alpha_4 = 20$,
$\alpha_5 = 50$. We get $g(\alpha_1) = 0.8646137$, $g(\alpha_2) = 0.2723305$,
$g(\alpha_3) = 0.5376906$, $g(\alpha_4) = 0.2302494$ and $g(\alpha_5) = 0.2251079$, where
$g(\alpha) = H_{\alpha,\beta}(P)$. Therefore $g(\alpha)$ is not a monotonic function of $\alpha$.

That completes the proof of Proposition 2.1.6.1.
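The definition of the measure of order $\alpha$ and type $\beta$ can be evaluated directly. The following is a minimal sketch, assuming natural logarithms; the function names and the test distributions are ours and are not taken from the thesis. It only checks two facts stated in this subsection: the reduction to Renyi's measure at $\beta = 1$ and the value $\ln(n)$ at the uniform distribution.

```python
import math

def kapur_entropy(p, alpha, beta):
    """H_{alpha,beta}(P) = ln(sum p_i^(alpha+beta-1) / sum p_i^beta) / (1 - alpha), alpha != 1."""
    num = sum(x ** (alpha + beta - 1) for x in p if x > 0)
    den = sum(x ** beta for x in p if x > 0)
    return math.log(num / den) / (1 - alpha)

def renyi_entropy(p, alpha):
    """Renyi's measure of order alpha (natural logarithm), alpha != 1."""
    return math.log(sum(x ** alpha for x in p if x > 0)) / (1 - alpha)

# Hypothetical distributions, chosen only for illustration.
p = [0.2, 0.3, 0.5]
uniform = [0.25, 0.25, 0.25, 0.25]

renyi_gap = abs(kapur_entropy(p, 2.0, 1.0) - renyi_entropy(p, 2.0))
uniform_gap = abs(kapur_entropy(uniform, 1.5, 0.7) - math.log(4))
degenerate_value = kapur_entropy([1.0, 0.0, 0.0], 1.5, 0.7)
```

Calling `kapur_entropy` over a grid of $\alpha$ for fixed $\beta$ is a quick way to examine the behaviour with respect to $\alpha$ for any particular distribution.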

Convexity or concavity of this measure w.r.t. either of the
two parameters is very difficult to establish, owing to the fact
that the expressions involved are very complicated. We shall
now close our discussion of this measure.

2.1.7 Kapur's Four Families of Measures of Entropy

These measures are given by

(i) $H_a(P) = -\sum_{i=1}^{n} p_i \ln p_i + \frac{1}{a} \sum_{i=1}^{n} \{(1 + a p_i) \ln(1 + a p_i) - a p_i\}$, $a > 0$


IX * IX 

(ii) ^■^(P) = - E In p^ + -^ { S Ci+bpj|^)lnCl+bp£) 

3d JL 

-(l+b)ln(l+b) 


n t 

(iii) H (p) = - s p. In p, + -^ { E (H-cp^ )ln(l+cpj )-c}
n n

(iv) = - S p . In p. + -^ C S (l+]q). )ln(l+kpj|^) 

i=l K i=l 

-(l+k)ln(l+k)} . 

One salient feature of these four families of measures
of entropy is that they are constructed with a view to satisfying
all the important properties of a measure of entropy. Refer to
Kapur [17] for a detailed discussion of the properties of the measures
belonging to these four classes.

However, we wish to point out an observation regarding
the minimum value of any of the measures of these families. For
degenerate distributions they attain their minimum. But unlike
any of the measures studied so far, this minimum value is not
independent of the parameter. For example consider

$\min_P H_a(P) = H_a(1, 0, \ldots, 0) = \frac{1}{a} \{(1+a) \ln(1+a) - a\}$

which is definitely a positive quantity. And
$\lim_{a \to 0} \frac{1}{a} \{(1+a) \ln(1+a) - a\} = 0$. So if we consider $\min_a \{\min_P H_a(P)\}$ then
it is zero. But for any other value of $a$, the minimum does not
vanish. This is undesirable, because when an outcome is
certain to happen the uncertainty in that experiment should be zero.

But for this, the measures belonging to these four
families satisfy most of the properties we have listed. Of course,
additivity and subadditivity are not satisfied.
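The observation about the non-vanishing minimum of family (i) can be checked numerically. This is a small sketch under the definition of $H_a(P)$ given above; the function names and the sample values of $a$ are ours.

```python
import math

def kapur_family_one(p, a):
    """H_a(P) = -sum p_i ln p_i + (1/a) sum {(1 + a p_i) ln(1 + a p_i) - a p_i}, a > 0."""
    shannon = -sum(x * math.log(x) for x in p if x > 0)
    extra = sum((1 + a * x) * math.log(1 + a * x) - a * x for x in p) / a
    return shannon + extra

def degenerate_value(a):
    """Closed form of H_a(1, 0, ..., 0): ((1 + a) ln(1 + a) - a) / a."""
    return ((1 + a) * math.log(1 + a) - a) / a

# Value at a degenerate distribution for a few illustrative parameter values.
values = {a: kapur_family_one([1.0, 0.0, 0.0], a) for a in (2.0, 1.0, 0.1, 1e-6)}
```

The values are positive and shrink towards zero as $a$ decreases, in line with the limit stated in the text.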

Measures of these four families are concave functions
of $p_1, p_2, \ldots, p_n$. Family (i) measures are concave with respect
to $a$, Family (ii) measures are convex w.r.t. $b$, Family (iii)
measures are also convex w.r.t. $c$ and Family (iv) measures
are concave functions w.r.t. $k$. With this we conclude the discussion
of these measures.

2.1.8 Behara and Chawla's Gamma Entropy :

It is given by $H_\gamma(P) = \frac{1}{2^{\gamma-1} - 1} \left\{ \left( \sum_{i=1}^{n} p_i^{1/\gamma} \right)^{\gamma} - 1 \right\}$, $\gamma \neq 1$, $\gamma > 0$.

$H_\gamma(P)$ is a non-negative continuous symmetric function of the
probabilities. The minimum value occurs for degenerate
distributions and is equal to zero. By applying Lagrange's
multipliers method, we obtain that $U = (\frac{1}{n}, \ldots, \frac{1}{n})$ is a
constrained critical point for $H_\gamma(P)$, which can only correspond
to a local maximum as the minimum is attained at degenerate distributions.
But whether this local maximum is also the global maximum depends
on the concavity of the measure. However we are unable to
conclude that $H_\gamma(P)$ is concave for any values of $\gamma$.

$H_\gamma(P)$ approaches Shannon's measure of information
when $\gamma \to 1$. If $\gamma$ is non-negative it satisfies the expansibility
property.

For P and Q independent we have the following relation:

$H_\gamma(PQ) = H_\gamma(P) + H_\gamma(Q) + (2^{\gamma-1} - 1) H_\gamma(P) H_\gamma(Q)$

from which we can easily see that $H_\gamma(P)$ is non-additive except
when $\gamma = 1$.

$H_\gamma(P)$ is not subadditive for all values of $\gamma$. This
we establish by providing the following example :

Example 2.1.8.1 : Let P*Q be a joint distribution with marginal
distributions P = (0.1, 0.9) and Q = (0.29, 0.71). Let $\gamma = 0.1$. Then we have

$H_{0.1}(P*Q) = 0.24001$, $H_{0.1}(P) = 0.063511$ and $H_{0.1}(Q) = 0.198670$

and thereby $H_{0.1}(P*Q) = 0.24001 < 0.26781 = H_{0.1}(P) + H_{0.1}(Q)$.

For $\gamma = 0.1$, $H_\gamma(P)$ is subadditive.

Again let $\gamma = 0.5$. Then we have

$H_{0.5}(P*Q) = 0.4773052$, $H_{0.5}(P) = 0.14561$ and $H_{0.5}(Q) = 0.35925$

and so $H_{0.5}(P*Q) = 0.4773052 < 0.50436 = H_{0.5}(P) + H_{0.5}(Q)$; again $H_\gamma$
is subadditive.

But consider $\gamma = 1.5$. Then we have

$H_{1.5}(P*Q) = 0.26746$, $H_{1.5}(P) = 0.090511$ and $H_{1.5}(Q) = 0.146240$

and hence $H_{1.5}(P*Q) = 0.26746 > 0.23675 = H_{1.5}(P) + H_{1.5}(Q)$

and so $H_\gamma$ is superadditive. For the distribution P*Q we have
chosen in this example, $H_\gamma$ is subadditive for $\gamma = 0.1$ and 0.5,
but when we cross over to 1.5, $H_\gamma$ is not subadditive any more.
We shall discuss this property of $H_\gamma$ in detail in Chap. 4.
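The relation for independent P and Q stated earlier in this subsection can be confirmed numerically. The sketch below assumes the form of $H_\gamma(P)$ normalised by $2^{\gamma-1} - 1$; the function names and the two test distributions are ours (they are not the distributions of Example 2.1.8.1, whose joint distribution is not a product).

```python
def gamma_entropy(p, gamma):
    """H_gamma(P) = ((sum p_i^(1/gamma))^gamma - 1) / (2^(gamma-1) - 1), gamma > 0, gamma != 1."""
    s = sum(x ** (1.0 / gamma) for x in p if x > 0)
    return (s ** gamma - 1.0) / (2.0 ** (gamma - 1.0) - 1.0)

def product(p, q):
    """Product (independent joint) distribution of p and q, flattened."""
    return [a * b for a in p for b in q]

# Hypothetical marginals, chosen only for illustration.
p = [0.3, 0.7]
q = [0.2, 0.5, 0.3]
gamma = 1.5

hp, hq = gamma_entropy(p, gamma), gamma_entropy(q, gamma)
lhs = gamma_entropy(product(p, q), gamma)
rhs = hp + hq + (2.0 ** (gamma - 1.0) - 1.0) * hp * hq
uniform_value = gamma_entropy([0.5, 0.5], gamma)  # normalisation check
```

The cross term vanishes only in the limit $\gamma \to 1$, which is why the measure is non-additive for every other admissible $\gamma$.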

Because of the very complicated expressions, we are
not able to decide the monotonic behaviour of $H_\gamma(P)$ w.r.t. $\gamma$,
nor its concavity or convexity properties w.r.t. $\gamma$. Again
consider the following example.

Example 2.1.8.2 : Let P*Q, P and Q be as in Example 1.2. We
get

= 0.108580, = 0*037065 and HQ^g(Q)=0*092 79 7 

^1.2^^*^^ = 1*182 3908,H^^2^P^ = 0#4l5840 and ^2 ^^^='^*’^20370 
H^Cp^Q) = 3^.1944382, H2(P) = 0.94918, H2(Q) = 1.43563. 

Now we can draw the following conclusions from the above data :

(i) $H_{0.8}(P*Q) < H_{0.1}(P*Q) < H_{1.5}(P*Q) < H_{0.5}(P*Q) < H_{1.2}(P*Q) < H_{2}(P*Q)$

(ii) $H_{0.8}(P) < H_{0.1}(P) < H_{1.5}(P) < H_{0.5}(P) < H_{1.2}(P) < H_{2}(P)$

(iii) $H_{0.8}(Q) < H_{1.5}(Q) < H_{0.1}(Q) < H_{0.5}(Q) < H_{1.2}(Q) < H_{2}(Q)$.

From the above relations we can easily conclude that $H_\gamma(\cdot)$ is
not a monotonic function of $\gamma$ either for $0 < \gamma < 1$ or for $\gamma > 1$.

We shall now conclude the discussion of Behara and
Chawla's Gamma entropy.

2.1.9 Sharma and Taneja's Measures of Entropy
They are given as below :

(i) $H^{\alpha,\beta}(P) = (2^{1-\alpha} - 2^{1-\beta})^{-1} \left\{ \sum_{i=1}^{n} p_i^{\alpha} - \sum_{i=1}^{n} p_i^{\beta} \right\}$, $\alpha \neq \beta$

(ii) $H_L(P) = -2^{\alpha-1} \left\{ \sum_{i=1}^{n} p_i^{\alpha} \log p_i \right\}$, and

(iii) $H_S(P) = -\frac{2^{\alpha-1}}{\sin \beta} \left\{ \sum_{i=1}^{n} p_i^{\alpha} \sin(\beta \log p_i) \right\}$.

We shall consider these measures in the order given above.

(i) Measure of order $\alpha$ and type $\beta$ :

We can obviously see that $H^{\alpha,\beta}(P)$ is a non-negative,
continuous and symmetric function of the probabilities. $H^{\alpha,\beta}(P)$
is also symmetric in $\alpha$ and $\beta$. Kapur [48] has thoroughly investigated
the range of validity of this measure. He has established that
for this measure the maximum may not always occur for the
uniform distribution. His result states that if


^ 1 , ra(a-l) _ ^ 

n n 

oc iS ^ 

then $H^{\alpha,\beta}$ has a local minimum at $U = (\frac{1}{n}, \frac{1}{n}, \ldots, \frac{1}{n})$. However
he has also proved [31] that if $\alpha < 1$ and $\beta > 1$, or $\alpha > 1$ and
$\beta < 1$, $H^{\alpha,\beta}(P)$ is a concave measure and the maximum occurs for
U. The minimum value is zero and it occurs for degenerate
distributions.

The maximum of $H^{\alpha,\beta}(P)$ for $\alpha < 1$ and $\beta > 1$ or vice versa
is given by

$\max_P H^{\alpha,\beta}(P) = H^{\alpha,\beta}(U) = \frac{n^{1-\alpha} - n^{1-\beta}}{2^{1-\alpha} - 2^{1-\beta}}$

and we can easily see that $H^{\alpha,\beta}(U)$ is an increasing function of $n$.


Now we shall consider the additivity and subadditivity
properties of $H^{\alpha,\beta}(P)$. It is not additive. For, if we take $\beta = 1$,
$H^{\alpha,\beta}(P) = H^{\alpha}(P)$, Havrda and Charvat's measure of entropy, which
was shown to be non-additive.

$H^{\alpha,\beta}(P)$ is also not subadditive for all values of $\alpha$
and $\beta$. We give here just one example which verifies our
claim; a detailed discussion of this property is included
in Chap. 4.

Example 2.1.9.1 : Let $\alpha = 0.5$ and $\beta = 1.5$, and let P*Q, P and Q be
as in Example 1.2. Then we have

$H^{0.5,1.5}(P*Q) = 1.5157705$, $H^{0.5,1.5}(P) = 0.53665$ and
$H^{0.5,1.5}(Q) = 0.38529$.

Therefore $H^{0.5,1.5}(P*Q) = 1.5157705 > 1.42234 = H^{0.5,1.5}(P) + H^{0.5,1.5}(Q)$.
Hence for this set of values $H^{\alpha,\beta}$ is not
subadditive.

Now we shall discuss the properties of this measure as a
function of its parameters. We shall prove the following result regarding
the concavity and convexity of $H^{\alpha,\beta}(P)$ w.r.t. $\alpha$ and $\beta$.

Proposition 2.1.9.1 : $H^{\alpha,\beta}(P)$ is

(i) Convex w.r.t. $\alpha$ if $\alpha < 1$ and $\beta > 1$
(ii) Concave w.r.t. $\alpha$ if $\alpha > 2$ and $\beta < 1$
(iii) Concave w.r.t. $\beta$ if $\alpha < 1$ and $\beta > 2$ and
(iv) Convex w.r.t. $\beta$ if $\alpha > 1$ and $\beta < 1$.

Proof : Let $f(\alpha) = (2^{1-\alpha} - 2^{1-\beta})^{-1} \left( \sum_{i=1}^{n} p_i^{\alpha} - \sum_{i=1}^{n} p_i^{\beta} \right)$. Then we have

$f'(\alpha)$ =


S pj log p. 2^"^( S pj - 2 p?) 

1=1 ^ ^ i=i i=sl ^ 

(ji-a _ 


f''(a) = 


2 p^dog p . 
1=1 ^ ^ 


o 2 - 0 C " 

2 2 

l=j. 

( 2 i=orpr=; 


2 pv log 
1=1 ^ 

— — 


and 

Pi 




2 ^-^ ( 2 p^ 
1=1 ^ 


" 3 - 


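(i) $\alpha < 1$ and $\beta > 1$. We recall the arguments of 1.3.2 and
conclude that the sum of the first two terms in $f''(\alpha)$
is positive. We know that $\left( \sum_{i=1}^{n} p_i^{\alpha} - \sum_{i=1}^{n} p_i^{\beta} \right) / (2^{1-\alpha} - 2^{1-\beta})$
is always positive, and $(1 - 2^{2-\alpha})$ is negative because $\alpha < 1$.
Therefore we get that $f''(\alpha) > 0$. That is, $f(\alpha)$ is convex.

(ii) $\alpha > 2$, $\beta < 1$. Then the first two terms
of $f''(\alpha)$ are obviously negative. Again $\left( \sum_{i=1}^{n} p_i^{\alpha} - \sum_{i=1}^{n} p_i^{\beta} \right) / (2^{1-\alpha} - 2^{1-\beta})$
is positive. But now $(1 - 2^{2-\alpha})$ is also positive, making the
third term of $f''(\alpha)$ also negative. Therefore we now
have $f''(\alpha) < 0$. That is, $f(\alpha)$ is concave.

Now let $g(\beta) = (2^{1-\alpha} - 2^{1-\beta})^{-1} \left( \sum_{i=1}^{n} p_i^{\alpha} - \sum_{i=1}^{n} p_i^{\beta} \right)$. Then we have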


g'O) = 


721 -“ _ jl-p, ( 7 . 1 -“ - 


i=i 


g" 0 ) = 


“ 2 P^Clog p^)^ 

7 ^^oqr=F 7 - 


2i-^( 2 P?- -2 pf )(1-22-^) 

i=i ^ i=l 


n 


and 


(2 


.= x r=A ■■ 
(iii) Let $\alpha < 1$ and $\beta > 2$. Then we can easily see that
$(2^{1-\alpha} - 2^{1-\beta}) > 0$ and so the first term in $g''(\beta)$ is negative,
and by the same arguments as in (i) and (ii) above, the
second term is also negative. Therefore $g''(\beta) < 0$ and
$g(\beta)$ is concave.

(iv) Let $\alpha > 1$ and $\beta < 1$. With similar arguments as above we
obtain that $g''(\beta) > 0$ and so $g(\beta)$ is convex. That completes
the proof.

But we are unable to decide the nature of $f(\alpha)$ and
$g(\beta)$ in the cases (a) $\beta < 1$, $1 < \alpha < 2$ and (b) $\alpha < 1$, $1 < \beta < 2$.

Although we could decide the sign of the second
derivatives of $f(\alpha)$ and $g(\beta)$, we are unable to do so for $f'(\alpha)$ and
$g'(\beta)$. We know that for $\beta = 1$, $H^{\alpha,\beta}(P)$ reduces to Havrda and
Charvat's measure, which is a monotonically decreasing function
of $\alpha$. With that we conclude our discussion of this measure.

(ii) Logarithmic Measure of Entropy :

$H_L(P)$ is actually a limiting case of $H^{\alpha,\beta}(P)$ as $\beta \to \alpha$.
We can easily verify that. It is also easy to see that $H_L(P)$
is a continuous symmetric function of the probabilities. $H_L(P)$
is always non-negative, and it vanishes for any degenerate
distribution; hence its minimum value is zero.
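The limiting relation between the measure of order $\alpha$ and type $\beta$ and the logarithmic measure can be checked numerically. The sketch below assumes that the logarithm in $H_L(P)$ is taken to base 2 (with that choice the factor $\ln 2$ coming from the denominator $2^{1-\alpha} - 2^{1-\beta}$ cancels); the function names and the test distribution are ours.

```python
import math

def sharma_taneja(p, alpha, beta):
    """H^{alpha,beta}(P) = (sum p^alpha - sum p^beta) / (2^(1-alpha) - 2^(1-beta)), alpha != beta."""
    num = sum(x ** alpha for x in p if x > 0) - sum(x ** beta for x in p if x > 0)
    return num / (2.0 ** (1 - alpha) - 2.0 ** (1 - beta))

def log_entropy(p, alpha):
    """H_L(P) = -2^(alpha-1) * sum p_i^alpha * log2(p_i); base 2 is an assumption of this sketch."""
    return -(2.0 ** (alpha - 1)) * sum(x ** alpha * math.log2(x) for x in p if x > 0)

p = [0.2, 0.3, 0.5]   # hypothetical distribution
alpha = 0.8
# Take beta very close to alpha and compare with the logarithmic measure.
gap = abs(sharma_taneja(p, alpha, alpha + 1e-6) - log_entropy(p, alpha))
```

The gap shrinks in proportion to $|\beta - \alpha|$, as expected from a first-order expansion of numerator and denominator around $\beta = \alpha$.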

Kapur [41] had established that $U = (\frac{1}{n}, \frac{1}{n}, \ldots, \frac{1}{n})$ is a
local minimum if $\alpha > 1$ and $n > \exp\{(2\alpha - 1)/\alpha(\alpha - 1)\}$. But once we
take care that these inequalities are not satisfied, we have
$H_L(P)$ being maximum at P = U. We have

$\max_P H_L(P) = H_L(U) = 2^{\alpha-1} n^{1-\alpha} \log n$.

Let us denote $f(n) = H_L(U)$. We then have

$f'(n) = 2^{\alpha-1} n^{-\alpha} \{(1-\alpha) \log n + 1\}$;

$f'(n) > 0 \iff (1-\alpha) \log n > -1$, i.e. $n > 2^{-1/(1-\alpha)}$ if $0 < \alpha < 1$

and $n < 2^{1/(\alpha-1)}$ if $\alpha > 1$.

Therefore the maximum of $H_L(P)$ is an increasing function of $n$
iff it satisfies the above conditions. For example, if $\alpha = 0.5$
the condition holds for every $n \ge 1$, and if $\alpha = 1.5$ then $n$ cannot be
greater than 4, in order that $f(n)$ be an increasing function
of $n$.


Now we shall consider additivity and subadditivity of
$H_L(P)$.

Additivity : For P and Q independent and PQ being their product
distribution we get the relation

$H_L(PQ) = \left( \sum_{j=1}^{m} q_j^{\alpha} \right) H_L(P) + \left( \sum_{i=1}^{n} p_i^{\alpha} \right) H_L(Q)$.

Therefore we can see that $H_L(P)$ is not additive in general unless
$\alpha = 1$, in which case $H_L(P)$ reduces to Shannon's measure.
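The product relation for the logarithmic measure holds for any base of logarithm and is easy to confirm numerically. A minimal sketch follows; the function names and the two marginals are ours, and base 2 is used only for definiteness.

```python
import math

def log_entropy(p, alpha):
    """H_L(P) = -2^(alpha-1) * sum p_i^alpha * log2(p_i)."""
    return -(2.0 ** (alpha - 1)) * sum(x ** alpha * math.log2(x) for x in p if x > 0)

def product(p, q):
    """Product distribution PQ of independent P and Q, flattened."""
    return [a * b for a in p for b in q]

p = [0.3, 0.7]        # hypothetical marginals
q = [0.2, 0.5, 0.3]
alpha = 0.5

lhs = log_entropy(product(p, q), alpha)
rhs = (sum(y ** alpha for y in q) * log_entropy(p, alpha)
       + sum(x ** alpha for x in p) * log_entropy(q, alpha))
additive_sum = log_entropy(p, alpha) + log_entropy(q, alpha)
```

For $\alpha < 1$ both weights $\sum q_j^{\alpha}$ and $\sum p_i^{\alpha}$ exceed 1, so the product value exceeds the plain sum of the two entropies.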


Subadditivity : We prove the following result on the subadditivity
of $H_L(P)$.

Proposition 2.1.9.2 : $H_L(P)$ is subadditive if $\alpha \ge 1$ and not
subadditive for $\alpha < 1$.

Proof : Let $\{\pi_{ij}\}$, $i = 1, \ldots, n$, $j = 1, \ldots, m$, be a joint probability distribution
and let $P = \{p_i\}$ and $Q = \{q_j\}$ be the corresponding marginal
probability distributions. We now have

$\pi_{ij} = p_i q_{ji}$

where $q_{ji}$ is the conditional probability that the $j$-th event of the
second distribution occurs when it is known that the $i$-th event of
the first distribution occurred. Now let $\alpha > 1$; we have

$H_L(P*Q) = -2^{\alpha-1} \sum_{i=1}^{n} \sum_{j=1}^{m} (p_i q_{ji})^{\alpha} \log(p_i q_{ji})$

$= -2^{\alpha-1} \sum_{i=1}^{n} p_i^{\alpha} \log p_i \left( \sum_{j=1}^{m} q_{ji}^{\alpha} \right) - 2^{\alpha-1} \sum_{i=1}^{n} p_i^{\alpha} \sum_{j=1}^{m} q_{ji}^{\alpha} \log q_{ji}$

$\le H_L(P) + \sum_{i=1}^{n} p_i^{\alpha} H_L(Q/p_i)$ [because $\sum_{j=1}^{m} q_{ji}^{\alpha} \le \sum_{j=1}^{m} q_{ji} = 1$]

$\le H_L(P) + H_L(Q)$, just as in the case of Shannon's measure.

This proof does not hold in the case of $\alpha < 1$ because if
$\alpha < 1$, $\sum_{i=1}^{n} p_i^{\alpha} > 1$.

Example 2.1.9.2 : Let P, Q and P*Q be the same as the
distributions taken in Example 2.1.4.1.

(i) $\alpha = 0.5$ : We have $H_L(P*Q) = 0.7108229$, $H_L(P) = 0.2543018$
and $H_L(Q) = 0.2933357$. Therefore we have
$H_L(P*Q) = 0.7108229 > 0.5476375 = H_L(P) + H_L(Q)$.

(ii) $\alpha = 2$ : We now have $H_L(P*Q) = 0.2575902$, $H_L(P) = 0.0941271$,
$H_L(Q) = 0.2403359$.

Therefore $H_L(P*Q) = 0.2575902 < 0.334513 = H_L(P) + H_L(Q)$.

The results in this example are obviously consistent
with those in Proposition 2.1.9.2.

Now we shall finally consider the properties of $H_L(P)$
as a function of its parameter.

Proposition 2.1.9.3 : Let $f(\alpha) = -2^{\alpha-1} \sum_{i=1}^{n} p_i^{\alpha} \log p_i$. Then $f(\alpha)$
is a monotonically decreasing and convex function of $\alpha$.

Proof : We have from the definition

$f'(\alpha) = -2^{\alpha-1} \sum_{i=1}^{n} p_i^{\alpha} \{\log p_i + (\log p_i)^2\}$ and

$f''(\alpha) = -2^{\alpha-1} \sum_{i=1}^{n} p_i^{\alpha} \log p_i \{1 + 2 \log p_i + (\log p_i)^2\}$.

Obviously, $(\log p_i)^2$ is greater than $|\log p_i|$ and therefore we
get that $f'(\alpha) < 0$ and $f''(\alpha) > 0$, i.e., $f(\alpha)$ is a monotonically
decreasing convex function of $\alpha$. That completes the proof.

With that we conclude the discussion of Sharma and
Taneja's logarithmic measure of entropy.

(iii) Sine Measure of Entropy :

This measure does not satisfy most of the required
properties of measures of entropy listed in 2.1.2.

The foremost drawback of $H_S(P)$ is that it can also be
negative. For consider the following

Example 2.1.9.3 : Let $\alpha = 2$, $\beta = -1.348$, $n = 2$, P = (0.2, 0.3).
Then we have

$H_S(P) = -0.5703569 < 0$.

Maximum and Minimum :

Even though $H_S(0, 1, 0, \ldots, 0) = 0$, we cannot claim that
zero is the minimum value of $H_S(P)$, because $H_S(P)$ can also assume
negative values. Kapur [41] has discussed $H_S(P)$ being
negative. But we are not able to obtain the absolute minimum
value that $H_S(P)$ attains.

We also cannot prove that $H_S(P)$ is maximum when P = U,
or that the value $H_S(U)$ is an increasing function of $n$. We can
neither prove nor disprove any of the other properties, like concavity
with respect to P, or monotonicity and concavity (convexity) with
respect to either of the parameters, all due to the extremely
complicated expressions.

However we can conclude that $H_S(P)$ is neither additive
nor subadditive.

Example 2.1.9.4 : Let $\alpha = 2$, $\beta = 1.5$, P = (0.5, 0.5), Q = (0.4, 0.6) and
P*Q = PQ, the product distribution. Then we have $H_S(PQ) = -0.131$,
$H_S(P) = -0.999$ and $H_S(Q) = -0.938$. Therefore we get that
$H_S(P*Q) = -0.131 > -1.937 = H_S(P) + H_S(Q)$, violating the
subadditivity law.

2.1.10 Rathie's Measure of Entropy with (n+1) Parameters :

It is given by $H^{\alpha;\beta_1,\ldots,\beta_n}(P) = (2^{1-\alpha} - 1)^{-1} \left[ \frac{\sum_{i=1}^{n} p_i^{\alpha+\beta_i-1}}{\sum_{i=1}^{n} p_i^{\beta_i}} - 1 \right]$.

Note that if we take $\beta_1 = \beta_2 = \cdots = \beta_n = 1$ we get the Havrda-Charvat
measure of entropy. This measure is not a symmetric function
of the probabilities. However it is a non-negative continuous
function of the parameters.


We shall now consider the maximum and minimum values
of this measure. Obviously the minimum value is zero, which is
attained for any degenerate distribution.

We shall show by way of an example that Rathie's
measure does not always attain its maximum for the uniform
distribution.

E xample 2«1»10*1 : Let n = 2^ ‘^2 2 and 

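$P_1 = (0.5, 0.5)$ and $P_2 = (0.4, 0.6)$. Then we have $H^{\alpha;\beta_1,\beta_2}(P_1) = 1.0$
and $H^{\alpha;\beta_1,\beta_2}(P_2) = 1.0052668$. Therefore we have $H^{\alpha;\beta_1,\beta_2}(P_1) < H^{\alpha;\beta_1,\beta_2}(P_2)$.
So this example justifies our claim that
$H^{\alpha;\beta_1,\ldots,\beta_n}(P)$ does not always attain its maximum value for
uniform distributions. And this is obvious because of the
non-symmetry of $H^{\alpha;\beta_1,\ldots,\beta_n}(P)$ w.r.t. P.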
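The behaviour described above can be reproduced with a short computation. The sketch below assumes the form of Rathie's measure normalised by $2^{1-\alpha} - 1$; the function name and the parameter values ($\alpha = 2$, $\beta = (1, 3)$) are hypothetical choices of ours and are not the values used in Example 2.1.10.1.

```python
def rathie_entropy(p, alpha, betas):
    """(sum p_i^(alpha+beta_i-1) / sum p_i^beta_i - 1) / (2^(1-alpha) - 1), alpha != 1."""
    num = sum(x ** (alpha + b - 1) for x, b in zip(p, betas) if x > 0)
    den = sum(x ** b for x, b in zip(p, betas) if x > 0)
    return (num / den - 1.0) / (2.0 ** (1 - alpha) - 1.0)

alpha, betas = 2.0, (1.0, 3.0)   # hypothetical parameter values

at_uniform = rathie_entropy((0.5, 0.5), alpha, betas)
at_p = rathie_entropy((0.4, 0.6), alpha, betas)
at_p_swapped = rathie_entropy((0.6, 0.4), alpha, betas)
```

For $n = 2$ the uniform distribution gives the value 1 whatever the $\beta_i$ are, while a non-uniform distribution can exceed it, and swapping the two probabilities changes the value because each $p_i$ carries its own $\beta_i$.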

There is a problem in either proving or
disproving additivity and subadditivity for Rathie's measure.
Because in P*Q there are $mn$ components, we need $mn$ parameters
besides $\alpha$. But we only have $n$ parameters from the
definition of the measure. Unless this question is resolved,
the additivity and subadditivity properties are not defined
for this measure of entropy. In fact it is a serious drawback,
as we often need to measure the entropy of a joint
probability distribution as well as the marginal entropies.

Also the expansibility property is not defined for
this measure, because of the same problem as stated above.

Because $H^{\alpha;\beta_1,\ldots,\beta_n}(P)$ does not have the sum property,
which is possessed by each of the measures considered so far, we
are unable to put forth any arguments for either proving or
disproving the concavity of $H^{\alpha;\beta_1,\ldots,\beta_n}(P)$ with respect to P.

We shall now consider its monotonic behaviour
w.r.t. the parameters.

Proposition 2.1.10.1 : $H^{\alpha;\beta_1,\ldots,\beta_n}(P)$ is a monotonically
increasing function of $\alpha$ for $\alpha > 1$. For $0 < \alpha < 1$, it is not a
monotonic function of $\alpha$.

Proof : Let $f(\alpha) = H^{\alpha;\beta_1,\ldots,\beta_n}(P)$.

Then we have


« a+3.~l n a+^.-l . 

(2** “-!)( S Pi log2Pi) + Pi ) 2^ ^ 

f'ia) = " - x — + 

( 2 p.^) 

i=l ^ 


.l~a 




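Now let $\alpha > 1$; then we have $(2^{1-\alpha} - 1) < 0$ and $\log p_i \le 0$ $\forall$ $i = 1, 2, \ldots, n$,
and therefore we get that $f'(\alpha) > 0$.

But for $0 < \alpha < 1$ we cannot arrive at a definite
conclusion, as we cannot estimate which of the two sums of powers of
the $p_i$ appearing in $f'(\alpha)$, the one weighted by $\log p_i$ (taken in
absolute value) or the other, is greater. Therefore no conclusion about $f'(\alpha)$
for $0 < \alpha < 1$ can be drawn.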

^ ^ 

Now if we consider vP ) we then have 


k. , ry.t _ ri _ 


^a-i 


g^Ov) = 


log p^{pj“^ S Pi - E Pi 


) 




2 

(S Pih^ 


, 1 < k < n 


But here also we face the same dilemma about the relative 

0,-1 n ^i n at^i-i 

magnitudes of (p, S P,- ^ and s P,* ^ 

^ i=l ^ i=l ^ 


With that we conclude the discussion of Rathie's
measure of entropy with (n+1) parameters.


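2.1.11 Van der Lubbe et al.'s Measures of Entropy : They are given
by

(i) $H_n^1(P; \rho, \sigma, \delta) = -\delta \log \left[ \sum_{i=1}^{n} p_i^{\rho} \right]^{\sigma}$

(ii) $H_n^2(P; \rho, \sigma, \delta) = \delta \left[ 1 - \left( \sum_{i=1}^{n} p_i^{\rho} \right)^{\sigma} \right]$ and

(iii) $H_n^3(P; \rho, \sigma, \delta) = \delta \left[ \left( \sum_{i=1}^{n} p_i^{\rho} \right)^{-\sigma} - 1 \right]$

where $(\rho, \sigma) \in D = \{(\rho, \sigma) \mid 0 < \rho < 1, \sigma < 0 \ \vee\ \rho > 1, \sigma > 0\}$ and $\delta$ is a
positive normalizing constant.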

Unlike the classical approach, where measures of
entropy are derived by characterizing the uncertainty of a
random variable, Van der Lubbe et al. [61] characterized the
certainty of a random variable and then considered monotonic
decreasing functions of this certainty measure as measures of
entropy.

The following are the axioms assumed by Lubbe et al.
for obtaining the measures stated above [Thm. 4 of [61]] :

(1) For stochastically independent experiments X and Y it
holds that

$H(PQ) = H(P) + H(Q) + c\,H(P) H(Q)$, $c \in R$.

(2) The information measure $H(P)$ is non-negative and a
continuous and strictly monotonic function of the certainty
measure $f(P)$.

Then they proved that the only non-trivial solutions
$H(P)$ are as follows :

(A) For $c = 0$, $H_n^1(P; \rho, \sigma, \delta)$

(B) For $c < 0$, $H_n^2(P; \rho, \sigma, \delta)$ and

(C) For $c > 0$, $H_n^3(P; \rho, \sigma, \delta)$.

So it is evident that only the measures belonging to
family (i) are additive. The measures belonging to the other two
families are non-additive.
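The composition law of axiom (1) can be checked for the three families on a product distribution. In the sketch below the values $c = 0$, $c = -1/\delta$ and $c = 1/\delta$ are our own derivation from the definitions (using the fact that the certainty measure of a product distribution is the product of the certainty measures); they are not quoted from [61]. Function names and the test data are also ours.

```python
import math

def certainty(p, rho, sigma):
    """Certainty measure (sum p_i^rho)^sigma."""
    return sum(x ** rho for x in p if x > 0) ** sigma

def h1(p, rho, sigma, delta=1.0):
    return -delta * math.log(certainty(p, rho, sigma))

def h2(p, rho, sigma, delta=1.0):
    return delta * (1.0 - certainty(p, rho, sigma))

def h3(p, rho, sigma, delta=1.0):
    return delta * (1.0 / certainty(p, rho, sigma) - 1.0)

def product(p, q):
    """Product distribution of independent p and q, flattened."""
    return [a * b for a in p for b in q]

def composition_gap(h, c, p, q, rho, sigma, delta):
    """|H(PQ) - (H(P) + H(Q) + c H(P) H(Q))| for independent P and Q."""
    hp, hq = h(p, rho, sigma, delta), h(q, rho, sigma, delta)
    return abs(h(product(p, q), rho, sigma, delta) - (hp + hq + c * hp * hq))

p, q = [0.5, 0.5], [0.4, 0.6]
rho, sigma, delta = 2.0, 0.05, 1.0
gaps = [
    composition_gap(h1, 0.0, p, q, rho, sigma, delta),           # c = 0
    composition_gap(h2, -1.0 / delta, p, q, rho, sigma, delta),  # c < 0
    composition_gap(h3, 1.0 / delta, p, q, rho, sigma, delta),   # c > 0
]
```

All three gaps are zero up to rounding, which matches the classification (A), (B), (C) by the sign of $c$.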

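They also obtained the following results :

(1) [Corollary 4 of [61]]

$\max_P H_n^1(P; \rho, \sigma, \delta) = -(1-\rho) \sigma \delta \log_2 n$

$\max_P H_n^2(P; \rho, \sigma, \delta) = \delta [1 - n^{(1-\rho)\sigma}]$

$\max_P H_n^3(P; \rho, \sigma, \delta) = \delta [n^{-(1-\rho)\sigma} - 1]$

and $\min_P H_n^i(P; \rho, \sigma, \delta) = H_n^i(1, 0, 0, \ldots, 0; \rho, \sigma, \delta) = 0$, $i = 1, 2, 3$.

(2) [Corollary 5 of [61]] (i) For fixed $\sigma$ and $\delta$ it holds that
$H_n^i(P; \rho, \sigma, \delta)$, $i = 1, 2, 3$, are

increasing w.r.t. $\rho$ for $\rho > 1$, $\sigma > 0$, $\delta > 0$ and
decreasing w.r.t. $\rho$ for $0 < \rho < 1$, $\sigma < 0$, $\delta > 0$.

(ii) For fixed $\rho$ and $\delta$ it holds that $H_n^i(P; \rho, \sigma, \delta)$, $i = 1, 2, 3$, are

increasing w.r.t. $\sigma$ for $\rho > 1$, $\sigma > 0$ and $\delta > 0$ and
decreasing w.r.t. $\sigma$ for $0 < \rho < 1$, $\sigma < 0$ and $\delta > 0$.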


Now we proceed to prove the following results :

Proposition 2.1.11.1 :

(i) $H_n^1(P; \rho, \sigma, \delta)$ is a concave function of $\rho$ for $0 < \rho < 1$ and
$\sigma < 0$,

(ii) There exists $\sigma_1$ s.t. for $\rho > 1$ and $\sigma < \sigma_1$, $H_n^1(P; \rho, \sigma, \delta)$ is
concave with respect to $\rho$ and for $\sigma > \sigma_1$ it is convex
w.r.t. $\rho$,

(iii) $H_n^1(P; \rho, \sigma, \delta)$ is both concave and convex w.r.t. $\sigma$, for
$(\rho, \sigma) \in D$.

Proof : Let P/^f, 5 )* Then we have 

f ' ( p ) =s - 6 [ 2 p, ] 2 p . log p, and 


f[(p) 

(i) 


= -6C 




i=i 


n p?* 2 ^ 

j]— 0 ( 2 ' log Pj ) + S 

i=i s p£ i=i 2 P'i 


^(log p^)^ ] 


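Let $0 < \rho < 1$ and $\sigma < 0$. Then we can see immediately that
$f_1''(\rho) < 0$ and therefore $H_n^1(P; \rho, \sigma, \delta)$ is concave w.r.t. $\rho$
in this case.

(ii) Now let $\rho > 1$ and $\sigma > 0$. Then we have from the convexity
of $x^2$

$\left( \sum_{i=1}^{n} \frac{p_i^{\rho}}{\sum_j p_j^{\rho}} \log p_i \right)^2 \le \sum_{i=1}^{n} \frac{p_i^{\rho}}{\sum_j p_j^{\rho}} (\log p_i)^2$.

Now let $\sigma_1 = \frac{\sum_{i=1}^{n} \frac{p_i^{\rho}}{\sum_j p_j^{\rho}} (\log p_i)^2}{\left( \sum_{i=1}^{n} \frac{p_i^{\rho}}{\sum_j p_j^{\rho}} \log p_i \right)^2}$. Then we can see easily

that $\left| \sigma \left( \sum_{i=1}^{n} \frac{p_i^{\rho}}{\sum_j p_j^{\rho}} \log p_i \right)^2 \right| \gtrless \left| \sum_{i=1}^{n} \frac{p_i^{\rho}}{\sum_j p_j^{\rho}} (\log p_i)^2 \right|$ depending on
$\sigma \gtrless \sigma_1$. Therefore $f_1''(\rho) \gtrless 0$ according as $\sigma \gtrless \sigma_1$. If $\sigma = \sigma_1$ then
we have that $f_1''(\rho) = 0$. That is, $H_n^1(P; \rho, \sigma, \delta)$ is both concave and
convex.

(iii) Now let $g_1(\sigma) = H_n^1(P; \rho, \sigma, \delta)$. We then have

$g_1'(\sigma) = -\delta \log \sum_{i=1}^{n} p_i^{\rho}$ and $g_1''(\sigma) = 0$, which completes
the proof of Proposition 2.1.11.1.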

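Proposition 2.1.11.2 :

(i) $H_n^2(P; \rho, \sigma, \delta)$ is a concave function for $\sigma > \sigma_2$ and a convex
function for $\sigma < \sigma_2$ w.r.t. $\rho$, where $(\rho, \sigma) \in D$ and $\sigma_2 = 1 - \sigma_1$.

(ii) $H_n^2(P; \rho, \sigma, \delta)$ is a concave function w.r.t. $\sigma$ for $(\rho, \sigma) \in D$.

Proof : (i) Let $f_2(\rho) = H_n^2(P; \rho, \sigma, \delta)$. Then we have

$f_2'(\rho) = -\delta \sigma \left( \sum_{i=1}^{n} p_i^{\rho} \right)^{\sigma-1} \sum_{i=1}^{n} p_i^{\rho} \log p_i$ and

$f_2''(\rho) = -\delta \sigma \left[ \sum_{i=1}^{n} p_i^{\rho} \right]^{\sigma} \left[ \left( \sum_{i=1}^{n} \frac{p_i^{\rho}}{\sum_j p_j^{\rho}} \log p_i \right)^2 (\sigma - 1) + \sum_{i=1}^{n} \frac{p_i^{\rho}}{\sum_j p_j^{\rho}} (\log p_i)^2 \right]$.

Now let $\sigma_2 = 1 - \sigma_1 = 1 - \frac{\sum_{i=1}^{n} \frac{p_i^{\rho}}{\sum_j p_j^{\rho}} (\log p_i)^2}{\left( \sum_{i=1}^{n} \frac{p_i^{\rho}}{\sum_j p_j^{\rho}} \log p_i \right)^2}$.

Now we can clearly see that for $\sigma = \sigma_2$, $f_2''(\rho) = 0$, and
for $\sigma < \sigma_2$, $f_2''(\rho) > 0$, and for $\sigma > \sigma_2$, $f_2''(\rho) < 0$, again making
use of the convexity of the function $x^2$.

Note here that $\sigma_2 < 0$ and $\sigma_1 > 0$. Therefore for all
$\rho > 1$ and $(\rho, \sigma) \in D$, $H_n^2(P; \rho, \sigma, \delta)$ is a concave function of $\rho$, and
there corresponds a point $\rho_2$ to $\sigma_2$ such that for $0 < \rho < \rho_2$,
$H_n^2(P; \rho, \sigma, \delta)$ is convex and for $\rho > \rho_2$ it is concave w.r.t. $\rho$.

(ii) Let $g_2(\sigma) = H_n^2(P; \rho, \sigma, \delta)$. Then we have

$g_2''(\sigma) = -\delta \left[ \sum_{i=1}^{n} p_i^{\rho} \right]^{\sigma} \left( \log \sum_{i=1}^{n} p_i^{\rho} \right)^2$.

We can clearly see that $g_2''(\sigma) \le 0$ for all $\sigma$. That
completes the proof of Proposition 2.1.11.2.

Proposition 2.1.11.3 :

(i) $H_n^3(P; \rho, \sigma, \delta)$ is a concave function w.r.t. $\rho$ for $\sigma < \sigma_3$
and a convex function w.r.t. $\rho$ for $\sigma > \sigma_3$, where $\sigma_3 = \sigma_1 - 1$.

(ii) $H_n^3(P; \rho, \sigma, \delta)$ is a convex function w.r.t. $\sigma$ for all $(\rho, \sigma) \in D$.

Proof : (i) It is exactly similar to that of (i), Proposition
2.1.11.2; thus we omit it here. We note, however, that $\sigma_1 > 1$.
Therefore we have the relation

$\sigma_2 < 0 < \sigma_3 < \sigma_1$.

(ii) Let $g(\sigma) = H_n^3(P; \rho, \sigma, \delta)$. Then we have

$g'(\sigma) = -\delta \left[ \sum_{i=1}^{n} p_i^{\rho} \right]^{-\sigma} \log \sum_{i=1}^{n} p_i^{\rho}$ and

$g''(\sigma) = \delta \left[ \sum_{i=1}^{n} p_i^{\rho} \right]^{-\sigma} \left( \log \sum_{i=1}^{n} p_i^{\rho} \right)^2$.

Therefore $g''(\sigma) \ge 0$ for all $(\rho, \sigma) \in D$. That completes the
proof of Proposition 2.1.11.3.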

We have now completed the study of the properties of
Lubbe's measures of entropy w.r.t. their parameters. Now we
shall deal with the concavity of these measures w.r.t. $P \in \Delta_n$.
We have the following

Proposition 2.1.11.4 : $H_n^i(P; \rho, \sigma, \delta)$ is pseudo-concave, concave
and pseudo-concave for $i = 1, 2, 3$ respectively, for $(\rho, \sigma) \in D_1 =
\{(\rho, \sigma) \mid 0 < \rho < 1, \sigma < 0 \ \vee\ \rho > 1, \sigma > 1\}$.

Proof : Theorem 2 of [61] states that for $(\rho, \sigma) \in D_1$, the
certainty measure $\left[ \sum_{i=1}^{n} p_i^{\rho} \right]^{\sigma}$ is convex with respect to $P \in \Delta_n$.
Now making use of the following results [15] we get our results :

(i) log (convex function) is pseudo-convex.

(ii) The negative of a convex (pseudo-convex) function is a
concave (pseudo-concave) function and vice versa.

(iii) Concave function / convex function is a pseudo-concave
function.

That completes the proof of Proposition 2.1.11.4. Now
we shall consider the subadditivity property of these measures.
None of the three measures is subadditive. Consider the following
examples :

Example 2.1.11.1 : Let $\delta = 1$, $\rho = 2$, $\sigma = 2$, and let P*Q, P and Q be the
same as in Example 2.1.4.1. We then have

$H_n^1(P*Q; 2, 2, 1) = 1.4821554$, $H_n^1(P; 2, 2, 1) = 0.3969018$ and
$H_n^1(Q; 2, 2, 1) = 1.0613765$.

Therefore $H_n^1(P*Q; 2, 2, 1) > H_n^1(P; 2, 2, 1) + H_n^1(Q; 2, 2, 1)$.

Example 2.1.11.2 : Let $\delta = 1$, $\rho = 2$, $\sigma = 0.05$, and let P*Q, P and Q be
as in Example 2.1.4.1. Then we have

$H_n^2(P*Q; 2, 0.05, 1) = 0.0363757$, $H_n^2(P; 2, 0.05, 1) = 0.0261854$ and
$H_n^2(Q; 2, 0.05, 1) = 0.0098734$.

Therefore $H_n^2(P*Q; 2, 0.05, 1) > H_n^2(P; 2, 0.05, 1) + H_n^2(Q; 2, 0.05, 1)$.

Example 2.1.11.3 : Let $\delta = 1$, $\rho = 2$, $\sigma = 0.05$, P = (0.5, 0.5),
Q = (0.4, 0.6) and P*Q = PQ, the product distribution. Then we
have

$H_n^3(P*Q; 2, 0.05, 1) = 0.0696737$, $H_n^3(P; 2, 0.05, 1) = 0.0352649$ and
$H_n^3(Q; 2, 0.05, 1) = 0.0332367$.

Therefore $H_n^3(P*Q; 2, 0.05, 1) > H_n^3(P; 2, 0.05, 1) + H_n^3(Q; 2, 0.05, 1)$.
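Example 2.1.11.3 specifies all of its data, so it can be recomputed directly from definition (iii) of this subsection. The following is a minimal sketch; the function name is ours.

```python
def lubbe_h3(p, rho, sigma, delta=1.0):
    """H^3(P; rho, sigma, delta) = delta * ((sum p_i^rho)^(-sigma) - 1)."""
    return delta * (sum(x ** rho for x in p if x > 0) ** (-sigma) - 1.0)

P = [0.5, 0.5]
Q = [0.4, 0.6]
PQ = [a * b for a in P for b in Q]   # product distribution

h_p = lubbe_h3(P, 2.0, 0.05)
h_q = lubbe_h3(Q, 2.0, 0.05)
h_pq = lubbe_h3(PQ, 2.0, 0.05)
```

The three values agree with those quoted in the example to seven decimal places, and the value for the product distribution exceeds the sum of the other two.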

Thus we have seen that the measures belonging to
each of the three families proposed by Lubbe satisfy all the
properties of entropy measures we listed in 2.1.2, except
the additivity and subadditivity properties. With that we
complete our study of Lubbe et al.'s measures of entropy.


2.1.12 Sharma and Mittal's Measure of Entropy :
It is given by ‘ 


1} a > 0, 


$a \neq 1$, $b \neq 0$. $H_{a,b}(P)$ is a generalization of $H^{a}(P)$, which we
obtain by taking $b = 1$. Kapur has discussed the validity of
$H_{a,b}(P)$ and established that

(i) $H_{a,b}(P)$ is always non-negative.

(ii) The minimum of $H_{a,b}(P)$ is zero and is attained by degenerate
distributions.

(iii) The maximum of $H_{a,b}(P)$ always occurs for the uniform distribution.

(iv) $H_{a,b}(P)$ is non-additive.

(v) $H_{a,b}(P)$ is concave w.r.t. P for $a > 1$, $b > 1$, or
$0 < a < 1$, $0 < b < 1$, or for $a < 1$, $b < 0$.

(vi) $H_{a,b}(P)$ is a monotonic decreasing function of $a$ $\forall$ $b$.

We cannot conclude the monotonic behaviour of $H_{a,b}(P)$
w.r.t. $b$, and the concavity or convexity of $H_{a,b}(P)$ is also equally
complicated. This measure is of interest because it generalizes
Havrda and Charvat's, Renyi's and Shannon's measures of entropy
without losing many of their properties. The only properties
lost are additivity (valid for Shannon's and Renyi's measures
only), subadditivity (which is always true only for Shannon's
measure) and concavity with respect to P for some values of $a$ and
$b$, for example $a > 1$ and $0 < b < 1$, or $0 < a < 1$ and $b > 1$.

With that we close our discussion of Sharma and Mittal's
measure of entropy.


2.1.13 Kapur's Generalized Measures of Entropy : The measures
we consider in this subsection are given by


C E p?)’" - ( E pf)^ 

(i) 


0~a)b 


a 5^ 3, h 9^ Of > 0 


(ii) l4.(P) 




(iii) Hj^(P) 




i»l 


(iv) H^(p) =ra^ln 




S p“(| H- (l-Wp^) 


1-a 


Now we shall consider the properties of these measures.

(i) The important feature of this measure is that it generalizes
many known measures of entropy. For $b = 1$ we get Sharma
and Taneja's measure of order $\alpha$ and type $\beta$, for $b \to 0$
we get Kapur, Aczel and Daroczy's measure of order $\alpha$
and type $\beta$, and other measures like Renyi's, Havrda and
Charvat's, Sharma and Mittal's and Shannon's are all
special cases of this measure.

Except for the additivity, concavity w.r.t. both
P and the parameters, and the subadditivity, this measure
satisfies all the other properties which we have listed
in 1.2.

The additivity is satisfied for the special cases when
Shannon's or Renyi's or Kapur-Aczel-Daroczy's measures
result.

(ii) Based on the definition $H(P) = \max_P D(P:U) - D(P:U)$, Kapur
has given a systematic method of obtaining entropy
measures from valid directed divergence measures. This
three-parameter measure is thus obtained by
constructing a three-parameter directed divergence
measure :

$D(P:Q) = \sum_{i=1}^{n} q_i \left( f\left(\frac{p_i}{q_i}\right) - f(1) \right)$

where $f(x)$ is a twice differentiable convex function.

By considering special cases for f(x) in D(P?Q) and the 
above definition for H(P), Kapur [s?] obtained 
hJ(p), H^(p) and H^(p)# 

All these measures, by their definition, satisfy
all the basic properties of measures of information, like
non-negativity, continuity and symmetry with respect to the
probabilities, minimum for degenerate distributions and
maximum for the uniform distribution, and the maximum value is
an increasing function of $n$. They also satisfy the
expansibility property.

With that we conclude our discussion of measures
of entropy and close this section with a table with the
properties listed in 2.1.2 against the measures listed
in 2.1.1.


2.1.14 Measures of Entropy and their Properties :

In the following table we have the measures of entropy listed
in 2.1.1 in rows and their properties listed in 2.1.2 in columns.
The following symbols are made use of :

Yes : if the property holds for the measure
No : if the property does not hold for the measure
Yes/No : if the property holds conditionally or for a
restricted range of the parameter(s) of the measure
- : if we are inconclusive about the validity of the
property for the measure
X : does not apply.

[Table: measures of entropy (rows) against properties (columns), with entries Yes, No, Yes/No and X. The row and column alignment of the entries is not recoverable.]
2.2 Measures of Directed Divergence

2.2.1 List of Measures of Directed Divergence

Let $P = (p_1, p_2, \ldots, p_n)$ and $Q = (q_1, q_2, \ldots, q_n)$ be
two discrete probability distributions with $\sum_{i=1}^{n} p_i = 1$ and
$\sum_{i=1}^{n} q_i = 1$. Then with the convention that
$0 \log 0 = 0$, we have the following measures of directed
divergence.

1) Kullback-Leibler's Measure [45]

$D(P;Q) = \sum_{i=1}^{n} p_i \ln \frac{p_i}{q_i}$   (2.1)

2) Renyi's Measure [55]


n 


3) Hayrda-Charvats'^ Measure [S] 


<X 1 


(2 # 2 ) 


D“(PfQ) = 5^1 I Pi 4~“ "4 “ ^ 

1 


4 ) S harma and Guptas' Measures [59] 

(i) D^(P/Q:a/p) = 2 ’"^ S p“ gf In ^ 

i=sl ^ ^ % 


(2*3) 


(2 -4) 


(ii) D^(P/Q:a,0,r) = 


i*i 


1 “X 


(2*5) 


(iil) D®(P?Q:0,y) 


n 


sin 


S P 


P; 


i=i 

— R ^ Y 


/ v^a sin (y In (2*6) 

11 <ji 


1062 ca 

5) Ferreri's Measure [7]

$D_\lambda(P;Q) = \frac{1}{\lambda} \sum_{i=1}^{n} (1 + \lambda p_i) \ln \frac{(1 + \lambda p_i)}{(1 + \lambda q_i)}$   (2.7)


6) Kapur's Measure [is] 


( 2 * 8 ) 


7) Rathie's (n+1 ) parameter Measure [52] 


D 




m 


a 


. n a+^.-l twv' ^ 


(2.9) 


8) Kapur's Measures [26, 27, 43]

(i)
\[ D_a(P:Q) = \sum_{i=1}^{n} p_i \ln \frac{p_i}{q_i} - \frac{1}{a} \sum_{i=1}^{n} (1 + a p_i) \ln \frac{1 + a p_i}{1 + a q_i}, \quad a > -1 \tag{2.10} \]


(ii) Dq(p;q) 




/ / ^ a l-a-ik , V 

(( E P/q,. ) - 1^ 


( 2 #11 ) 


i=i 


X -"X 


a 5 ^ k 5 ^ 1/ S/b/ and c are real numbers 


a 1-a, 


( 2 . 12 ) 


(iii) d^(p;Q) = { (P 

where (p(») is a twice differentiable convex function. 


^ P-! V 

(iv) D (p;Q) = W{ S <p(:“)l where ^(.) is a twxce 

W M to j -J- M.4 


W* <P i=l - 

differentiable convex function 


(2.13) 
2.2.2 List of Properties of Measures of Directed Divergence:

The following properties are checked for the directed divergence measures listed in 2.2.1. If $D(P:Q)$ is a directed divergence measure, then:

1) $D(P:Q)$ is non-negative.

2) $D(P:Q) = 0$ if and only if $P = Q$.

3) $D(P:Q)$ is a convex or pseudo-convex function of P and Q.

4) Additivity: If P and Q, and R and S, are pairwise independent probability distributions, then

$D(P*Q : R*S) = D(P:R) + D(Q:S)$. [Here $P*Q = PQ$ and $R*S = RS$.]

5) Subadditivity: If P, Q, R and S are any probability distributions, then

$D(P*Q : R*S) \le D(P:R) + D(Q:S)$.

6) For $0 < \lambda < 1$ and P and Q any probability distributions,

$f(\lambda) = D[P : \lambda Q + (1-\lambda)P]$ is an increasing function of $\lambda$.

7) If any parameters are involved, then $D(P:Q)$ is a monotonic function of each of them.

8) If any parameters are involved, then $D(P:Q)$ is either concave (pseudo-concave) or convex (pseudo-convex) with respect to each of them.

Analysis

2.2.3 Kullback-Leibler Measure of Directed Divergence

$D(P:Q)$ satisfies all the properties listed in 2.2.2 except for the properties with respect to the parameters. In what follows, we shall verify (i) subadditivity and (ii) that $f(\lambda) = D(P : \lambda Q + (1-\lambda)P)$ is monotonically increasing. The latter is proved by Kapur [14].

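Subadditivity of $D(P:Q)$: Let $P*Q$ and $R*S$ be any two joint probability distributions with P and Q, and R and S, as their respective marginal distributions. Then we have

\[ D(P*Q : R*S) = \sum_{i=1}^{n} \sum_{j=1}^{n} p_i q_{j|i} \ln \frac{p_i q_{j|i}}{r_i s_{j|i}}, \]

where $P*Q = (p_i q_{j|i})_{n \times n} = (\pi_{ij})_{n \times n}$ and, similarly, $R*S = (r_i s_{j|i})_{n \times n}$. Hence

\[ D(P*Q : R*S) = \sum_{i=1}^{n} p_i \ln \frac{p_i}{r_i} \sum_{j=1}^{n} q_{j|i} + \sum_{i=1}^{n} p_i \sum_{j=1}^{n} q_{j|i} \ln \frac{q_{j|i}}{s_{j|i}} \]

\[ = \sum_{i=1}^{n} p_i \ln \frac{p_i}{r_i} + \sum_{i=1}^{n} p_i \sum_{j=1}^{n} q_{j|i} \ln \frac{q_{j|i}}{s_{j|i}} \]

\[ = D(P:R) + \sum_{i=1}^{n} p_i D(Q:S \mid p_i) \]

\[ \le D(P:R) + D(Q:S). \]

That proves the subadditivity of $D(P:Q)$. Now we shall reproduce here the elegant proof of Kapur.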
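The decomposition step in the subadditivity argument above, $D(P*Q:R*S) = D(P:R) + \sum_i p_i D(Q:S \mid p_i)$, can be checked numerically. The following is only a sketch: the function name `kl` and the variable names are illustrative, and the joint distributions are borrowed from Example 2.2.4.1 of this chapter. It checks the equality only, not the final inequality.

```python
import math

def kl(p, q):
    """Kullback-Leibler directed divergence sum p_i ln(p_i/q_i), with 0 ln 0 = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Joint distributions (rows indexed by i, columns by j), assumed for illustration.
PQ = [[0.02, 0.08], [0.27, 0.63]]
RS = [[0.03, 0.06], [0.30, 0.61]]

P = [sum(row) for row in PQ]  # row marginal of P*Q
R = [sum(row) for row in RS]  # row marginal of R*S

# Divergence between the two joint distributions, flattened row by row.
joint = kl([x for row in PQ for x in row], [y for row in RS for y in row])

# Conditional term: sum_i p_i * D(Q given i : S given i).
conditional = sum(
    p_i * kl([x / p_i for x in row_pq], [y / r_i for y in row_rs])
    for p_i, r_i, row_pq, row_rs in zip(P, R, PQ, RS)
)

decomposition = kl(P, R) + conditional
print(joint, decomposition)
```

The two printed numbers agree up to rounding, which is the chain-rule identity used in the third line of the derivation.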


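Let

\[ f(\lambda) = D\{P : \lambda Q + (1-\lambda)P\} = \sum_{i=1}^{n} (\lambda q_i + (1-\lambda)p_i)\, \varphi\!\left( \frac{p_i}{\lambda q_i + (1-\lambda)p_i} \right), \]

where $\varphi(x) = x \ln x$.

We know that $\varphi(x)$ is a convex function and so $\varphi''(x) \ge 0$. We have

\[ f''(\lambda) = \sum_{i=1}^{n} \frac{p_i^2 (q_i - p_i)^2}{\{\lambda q_i + (1-\lambda)p_i\}^3}\, \varphi''\!\left( \frac{p_i}{\lambda q_i + (1-\lambda)p_i} \right) \ge 0. \]

Because

\[ f'(\lambda) = -\sum_{i=1}^{n} \frac{p_i (q_i - p_i)}{\lambda q_i + (1-\lambda)p_i} \]

is zero when $\lambda = 0$, we get that $f'(\lambda) \ge 0$, which is the desired result.

With that we conclude our discussion of the Kullback-Leibler measure of directed divergence.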
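As a rough numerical illustration of property (6) for the Kullback-Leibler measure, the sketch below evaluates $f(\lambda) = D(P : \lambda Q + (1-\lambda)P)$ on a grid and checks that the values never decrease. The distributions P and Q and the grid size are arbitrary choices made for this illustration, not data from the thesis.

```python
import math

def kl(p, q):
    """Kullback-Leibler directed divergence with the convention 0 ln 0 = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def f(lam, p, q):
    """f(lambda) = D(P : lambda*Q + (1 - lambda)*P)."""
    mix = [lam * qi + (1 - lam) * pi for pi, qi in zip(p, q)]
    return kl(p, mix)

P = [0.1, 0.9]    # illustrative distribution
Q = [0.29, 0.71]  # illustrative distribution

grid = [k / 100 for k in range(101)]
values = [f(lam, P, Q) for lam in grid]

# f starts at D(P:P) = 0 and should not decrease as lambda grows to 1.
increasing = all(b >= a for a, b in zip(values, values[1:]))
print(values[0], values[-1], increasing)
```

A grid check of this kind is no substitute for the proof; it only confirms the behaviour for one pair of distributions.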

2.2.4 Renyi's Measure of Directed Divergence: It is given by

\[ D_\alpha(P:Q) = \frac{1}{\alpha-1} \ln \sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha}, \quad \alpha \neq 1,\ \alpha > 0. \tag{2.14} \]

We can easily observe that $D_\alpha(P:Q)$ is non-negative by making use of Renyi's inequality [18], viz.

\[ \sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha} \gtrless 1 \ \text{according as}\ \alpha \gtrless 1, \qquad \sum_{i=1}^{n} p_i = \sum_{i=1}^{n} q_i = 1. \tag{2.15} \]

$D_\alpha(P:Q)$ is pseudo-convex because $\sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha}$ is convex for $\alpha > 1$ and concave for $0 < \alpha < 1$, and log (convex function) is a pseudo-convex function while log (concave function) is a pseudo-concave function. Now if $\alpha > 1$, $(\alpha - 1)$ is positive, and if $\alpha < 1$, $(\alpha - 1)$ is negative. Again, the negative of a pseudo-concave function is pseudo-convex. So, for all $\alpha$, Renyi's measure of directed divergence is pseudo-convex.
Example 2.2.4.1

Let

\[ P*Q = \begin{pmatrix} 0.02 & 0.08 \\ 0.27 & 0.63 \end{pmatrix} \quad \text{and} \quad R*S = \begin{pmatrix} 0.03 & 0.06 \\ 0.30 & 0.61 \end{pmatrix}. \]

(i) Let $\alpha = 2$. Then we have

$D_2(P*Q : R*S) = 0.013563333$, $D_2(P:R) = 0.00122025511$ and $D_2(Q:S) = 0.0072104361$.

Therefore $D_2(P*Q : R*S) > D_2(P:R) + D_2(Q:S)$.

(ii) Let $\alpha = 0.5$. Then we have

$D_{0.5}(P*Q : R*S) = 0.5056$, $D_{0.5}(P:R) = 0.1226$ and $D_{0.5}(Q:S) = 0.3316$;

again we have $D_{0.5}(P*Q : R*S) > D_{0.5}(P:R) + D_{0.5}(Q:S)$.
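Part (i) of the example can be recomputed directly from the definition of Renyi's measure, $D_\alpha(P:Q) = \frac{1}{\alpha-1}\ln\sum p_i^\alpha q_i^{1-\alpha}$. The sketch below assumes that the marginals of the two joint distributions are P = (0.1, 0.9), Q = (0.29, 0.71), R = (0.09, 0.91) and S = (0.33, 0.67), as stated in Example 2.2.5.1; the function name is illustrative.

```python
import math

def renyi_divergence(p, q, alpha):
    """Renyi directed divergence of order alpha (alpha > 0, alpha != 1)."""
    s = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q))
    return math.log(s) / (alpha - 1)

# Joint distributions, flattened row by row, and their marginals.
PQ = [0.02, 0.08, 0.27, 0.63]
RS = [0.03, 0.06, 0.30, 0.61]
P, Q = [0.1, 0.9], [0.29, 0.71]
R, S = [0.09, 0.91], [0.33, 0.67]

joint = renyi_divergence(PQ, RS, 2)
d_pr = renyi_divergence(P, R, 2)
d_qs = renyi_divergence(Q, S, 2)

# A positive excess means subadditivity fails for this data at alpha = 2.
excess = joint - d_pr - d_qs
print(joint, d_pr, d_qs, excess)
```

For $\alpha = 2$ the three computed values agree with those quoted in part (i) to about six decimal places, and the excess is positive.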

7 7 7 

We have carried out calculations with various distributions to test the subadditivity of Renyi's measure of directed divergence. Our results are tabulated below:


: P Q 


/ 0.2 

^ 0.2 


' and R* S 

/ U •O 


= ( 


0.1 

0.1 


/ 0.6 s 

, 0.2 


a 

D^(p •if Q;r *S)-Dqj_(P;R) - Dj^(Q?S) 

0.110 

0.22 70182 X10“^ 

0.510 

0.6353982 X10“^ 

0.960 

0.29 33892 Xlo“^ 

1 .010 

-0.1178731 xl0“^ 

1.510 

-0.2481915 xio“^ 

1 

2.010 

-0.6186135 X10“^ 


Table 2 .2 »4.1 
For $D_1$, Renyi's measure of directed divergence is not subadditive for $0 < \alpha < 1$ and is subadditive for all $\alpha > 1$.


: P Q 


/ 0*15 
'‘0*2 5 


0 #35 \ 
0 * 25 “^ 


and R S 


/0*05 f 0*65'\ 
^ 0*15 , 0 . 15 ^* 


r 

a 

D (p Q;R-:5- s)-d„Cp/r) - D^(Q;s) 

U» CC (X 

0.150 

0.2 9 3341 7 Xlo“^ 

0.550 

0.9474033 xio~^ 

0*950 

0*1131955 X10“^ 

2.135 

0.1164209 xlo”^ 

2 *140 

“0.12 56293 X10“^ 

2*175 

i 

“0-1322637 X 10 “^ 


Table 2 *2 *4 *2 


For $D_2$, Renyi's directed divergence of order $\alpha$ is not subadditive until $\alpha$ reaches a value $\alpha_0$, $2.135 < \alpha_0 < 2.140$. For $\alpha \ge \alpha_0$ it is subadditive.


D 3 : P Q 


/0*175 

'“0*225 


0*32 5\ 
0*275^ 


and R S 


,0*075 

^0.125 


0*62 5^ 
0*175^ 


' a 

D^^p -x- q;r-x- s;-d^(p>r) “ D^(Q/Sl 

0*150 

0.2 556981 x 10“^* 

0.550 

0.53034 35 X 10“^ 

0,950 

0*1620397X 10“^ 

1 .01 5 

0*9306032 X 10 ”'^ 

1 .020 

“0,2774 503X10“^ 

1.02 5 

“0,1581162 Xlo“^ 

1.075 

0.1524230 X10”^ 


Table 2»2*4*3 
We observe that for $D_3$, Renyi's measure of directed divergence is not subadditive for all $\alpha < \alpha_0$, where $1.015 < \alpha_0 < 1.020$, and is subadditive for all $\alpha \ge \alpha_0$.


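\[ D_4 : \quad P*Q = \begin{pmatrix} 0.02 & 0.08 \\ 0.27 & 0.63 \end{pmatrix} \quad \text{and} \quad R*S = \begin{pmatrix} 0.03 & 0.06 \\ 0.30 & 0.61 \end{pmatrix} \]

Our results show that Renyi's measure of directed divergence is not subadditive for this set of distributions for any value of $\alpha$.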


: P -it Q 


/0*2 5 0-2 5 \ 

^ 0-25 0 . 25 ^ 


and R S 


, 0-02 0 - 03 ) 

^0-33 0-07 


a 

Dq^(p q;r *s) - d^(p?r) - d^(q;s) 

0.150 

0.2016353 Xlo”^ 

0-550 

0.7489688 xlo“^ 

0-9 50 

0.9015147 xlo”*^ 

1.010 

0-862 8271 X lo"^ 

1 .510 

0-7976070 X lo”^ 

1.560 

-0.1083849 X lo”^ 

1.760 

-0.5797683 xlo“^ 


Table 2 *2 *4 -4 


For $D_5$ we find that Renyi's measure of directed divergence is not subadditive for $\alpha < \alpha_0$, where $1.510 < \alpha_0 < 1.560$, and is subadditive for all $\alpha \ge \alpha_0$.

Kapur [19] has established that $D_\alpha(P:Q)$ is a monotonically increasing function of $\alpha$ and also that it is a pseudo-convex function of $\alpha$ for $0 < \alpha < 1$ and a pseudo-concave function of $\alpha$ for $\alpha > 1$.

Now we shall conclude our discussion of Renyi's measure of directed divergence with our result that $D_\alpha(P:Q)$ satisfies property (6) only for $0 < \alpha < 1$.

Proposition 2.2.4.1: $f(\lambda) = D_\alpha\{P : \lambda Q + (1-\lambda)P\}$, $0 < \lambda < 1$, is a monotonically increasing function of $\lambda$ if $0 < \alpha < 1$.

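Proof: We have from the definition of $D_\alpha(P:Q)$,

\[ f(\lambda) = \frac{1}{\alpha-1} \ln \sum_{i=1}^{n} p_i^{\alpha} \{\lambda q_i + (1-\lambda)p_i\}^{1-\alpha} \]

and therefore

\[ f'(\lambda) = \frac{\sum_{i=1}^{n} (1-\alpha)\, p_i^{\alpha} \{\lambda q_i + (1-\lambda)p_i\}^{-\alpha} (q_i - p_i)}{(\alpha-1) \sum_{i=1}^{n} p_i^{\alpha} \{\lambda q_i + (1-\lambda)p_i\}^{1-\alpha}} \tag{2.16} \]

and

\[ f''(\lambda) = \frac{-\alpha(1-\alpha) \Big( \sum_{i=1}^{n} p_i^{\alpha} \{\lambda q_i + (1-\lambda)p_i\}^{1-\alpha} \Big) \Big( \sum_{i=1}^{n} p_i^{\alpha} \{\lambda q_i + (1-\lambda)p_i\}^{-(\alpha+1)} (q_i - p_i)^2 \Big) - (1-\alpha)^2 \Big\{ \sum_{i=1}^{n} p_i^{\alpha} \{\lambda q_i + (1-\lambda)p_i\}^{-\alpha} (q_i - p_i) \Big\}^2}{(\alpha-1) \Big\{ \sum_{i=1}^{n} p_i^{\alpha} \{\lambda q_i + (1-\lambda)p_i\}^{1-\alpha} \Big\}^2}. \tag{2.17} \]

We can easily see that $f'(0) = 0$. For $0 < \alpha < 1$, we have from (2.17) that $f''(\lambda) > 0$. Therefore, for $0 < \alpha < 1$, $f'(\lambda)$ increases from zero as $\lambda$ increases from zero. That is, $f'(\lambda) \ge 0$ for $0 \le \lambda \le 1$. But for $\alpha > 1$ we cannot conclude any similar result.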
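The statement of Proposition 2.2.4.1 can be illustrated numerically for one value of $\alpha$ in $(0, 1)$. The sketch below uses $\alpha = 0.5$ and an arbitrary pair of distributions (both are assumptions made for this illustration) and checks that $f(\lambda)$ does not decrease on a grid.

```python
import math

def renyi_divergence(p, q, alpha):
    """Renyi directed divergence of order alpha (alpha > 0, alpha != 1)."""
    s = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q))
    return math.log(s) / (alpha - 1)

def f(lam, p, q, alpha):
    """f(lambda) = D_alpha(P : lambda*Q + (1 - lambda)*P)."""
    mix = [lam * qi + (1 - lam) * pi for pi, qi in zip(p, q)]
    return renyi_divergence(p, mix, alpha)

P, Q, ALPHA = [0.1, 0.9], [0.29, 0.71], 0.5  # illustrative choices

grid = [k / 50 for k in range(51)]
values = [f(lam, P, Q, ALPHA) for lam in grid]
is_increasing = all(b >= a for a, b in zip(values, values[1:]))
print(values[0], values[-1], is_increasing)
```

The first value is zero up to rounding, since the mixture coincides with P at $\lambda = 0$.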
2.2.5 Havrda-Charvat Measure of Directed Divergence:

It is given by

\[ D^{\alpha}(P:Q) = \frac{1}{\alpha-1} \Big\{ \sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha} - 1 \Big\}, \quad \alpha \neq 1,\ \alpha > 0. \tag{2.18} \]

It can be easily verified that $D^{\alpha}(P:Q)$ is non-negative by making use of (2.15), and convex w.r.t. both P and Q by making use of the fact that $\sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha}$ is convex for $\alpha > 1$ and concave for $\alpha < 1$.

It can also be verified that it is a non-additive measure, unlike $D_\alpha(P:Q)$ and $D(P:Q)$. We now consider its subadditivity property:


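Example 2.2.5.1: Let

\[ P*Q = \begin{pmatrix} 0.02 & 0.08 \\ 0.27 & 0.63 \end{pmatrix} \quad \text{and} \quad R*S = \begin{pmatrix} 0.03 & 0.06 \\ 0.30 & 0.61 \end{pmatrix}, \]

and $P = (0.1, 0.9)$, $Q = (0.29, 0.71)$, $R = (0.09, 0.91)$ and $S = (0.33, 0.67)$.

(i) Let $\alpha = 2$. Then we have

$D^{\alpha}(P*Q : R*S) = 0.0136557$, $D^{\alpha}(P:R) = 0.001221$ and $D^{\alpha}(Q:S) = 0.0072365$.

Therefore $D^{\alpha}(P*Q : R*S) > D^{\alpha}(P:R) + D^{\alpha}(Q:S)$.

(ii) Let $\alpha = \tfrac{1}{2}$. Then we have

$D^{\alpha}(P*Q : R*S) = 0.0033974$, $D^{\alpha}(P:R) = 0.00029098$ and $D^{\alpha}(Q:S) = 0.0018715$.

Again we have $D^{\alpha}(P*Q : R*S) > D^{\alpha}(P:R) + D^{\alpha}(Q:S)$.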
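Both parts of this example can be recomputed from the definition $D^\alpha(P:Q) = \frac{1}{\alpha-1}\{\sum p_i^\alpha q_i^{1-\alpha} - 1\}$. The sketch below is only illustrative (the function and variable names are not from the thesis); it uses the distributions of Example 2.2.5.1.

```python
def havrda_charvat_divergence(p, q, alpha):
    """Havrda-Charvat directed divergence of order alpha (alpha > 0, alpha != 1)."""
    s = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q))
    return (s - 1) / (alpha - 1)

# Joint distributions, flattened row by row, and their marginals.
PQ = [0.02, 0.08, 0.27, 0.63]
RS = [0.03, 0.06, 0.30, 0.61]
P, Q = [0.1, 0.9], [0.29, 0.71]
R, S = [0.09, 0.91], [0.33, 0.67]

results = {}
for alpha in (2, 0.5):
    joint = havrda_charvat_divergence(PQ, RS, alpha)
    d_pr = havrda_charvat_divergence(P, R, alpha)
    d_qs = havrda_charvat_divergence(Q, S, alpha)
    results[alpha] = (joint, d_pr, d_qs)
    # A positive excess means subadditivity fails for this alpha.
    print(alpha, joint, d_pr, d_qs, joint - d_pr - d_qs)
```

For both $\alpha = 2$ and $\alpha = 1/2$ the computed values agree with the figures in the example to the quoted precision, and the excess is positive in each case.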
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Table 2. 2. 5*1 


: P * Q 


/O *2 
^ 0*2 


and R-)«-S 

U ♦o 


(0<.l 0^6) 

^ 0*1 0*2 


Then our calculations have shown that for all $\alpha < 1$, Havrda-Charvat's measure of directed divergence is not subadditive, while for all $\alpha > 1$ it is subadditive.


Table 2.2.5.2

\[ D_2 : \quad P*Q = \begin{pmatrix} 0.15 & 0.35 \\ 0.25 & 0.25 \end{pmatrix} \quad \text{and} \quad R*S = \begin{pmatrix} 0.05 & 0.65 \\ 0.15 & 0.15 \end{pmatrix} \]

Then for $D_2$ our calculations have shown that Havrda-Charvat's measure of directed divergence is not subadditive for any value of $\alpha$.


Table 2 *2. 5. 3 : 


D 3 I P * Q 


/0.175. 0*325x 
^0*225 0.275^ 


and R * S 


, 0*015 
^0-12 5 


0*62 5^ 
0.175^ 


a 

d‘^(p^{- q;r*s) - D®(pfR)“D°^(Q;s) 

0*100 

0*1668872 Xlo"*^ 

0.500 

0.4463673 xio"^ 

0*900 

0*1856760 xio"*^ 

1.100 

-0.1177340 xl0“^ 

2*000 

- 0.2 599993 X10"'^ 

5.600 

-0.1707166 X 10 “^ 

5*700 

0*2041808 xio"*^ 

7*500 

1.443590 

9.900 

16*32 3800 


Table 2. 2. 5*2 
We observe that for $\alpha < \alpha_0$, where $1.0 < \alpha_0 < 1.1$, $D^{\alpha}$ is not subadditive; then for $\alpha_0 \le \alpha < \alpha_1$, where $5.6 < \alpha_1 < 5.7$, $D^{\alpha}$ is subadditive; and again for $\alpha > \alpha_1$, $D^{\alpha}$ is not subadditive.

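Table 2.2.5.4:

\[ D_4 : \quad P*Q = \begin{pmatrix} 0.02 & 0.08 \\ 0.27 & 0.63 \end{pmatrix} \quad \text{and} \quad R*S = \begin{pmatrix} 0.03 & 0.06 \\ 0.30 & 0.61 \end{pmatrix} \]

Our calculations have shown that $D^{\alpha}$ is not subadditive for any value of $\alpha$.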


Table 2.2.5.5:

\[ D_5 : \quad P*Q = \begin{pmatrix} 0.25 & 0.25 \\ 0.25 & 0.25 \end{pmatrix} \quad \text{and} \quad R*S = \begin{pmatrix} 0.02 & 0.08 \\ 0.83 & 0.07 \end{pmatrix} \]

For $D_5$ also our calculations have shown that $D^{\alpha}$ is not subadditive for any value of $\alpha$.


Now we shall consider the properties of $D^{\alpha}(P:Q)$ with respect to $\alpha$. Kapur [13] has established that $D^{\alpha}(P:Q)$ is monotonically increasing with $\alpha$ and is a pseudo-convex function of $\alpha$ for $0 < \alpha < 1$ and pseudo-concave for $\alpha > 1$. We shall now consider the verification of Property (6) of 2.2.2.

Proposition 2.2.5.1: $f(\lambda) = D^{\alpha}\{P : \lambda Q + (1-\lambda)P\}$ is an increasing function of $\lambda$ for $0 < \lambda < 1$.


Proof: We have

\[ f(\lambda) = \frac{1}{\alpha-1} \Big\{ \sum_{i=1}^{n} p_i^{\alpha} (\lambda q_i + (1-\lambda)p_i)^{1-\alpha} - 1 \Big\}, \tag{2.19} \]

\[ f'(\lambda) = \frac{1}{\alpha-1} \sum_{i=1}^{n} p_i^{\alpha} (1-\alpha) (\lambda q_i + (1-\lambda)p_i)^{-\alpha} (q_i - p_i), \tag{2.20} \]

\[ f''(\lambda) = \alpha \sum_{i=1}^{n} p_i^{\alpha} (\lambda q_i + (1-\lambda)p_i)^{-(\alpha+1)} (q_i - p_i)^2. \tag{2.21} \]

From (2.20) we can see that $f'(0) = 0$, and from (2.21) we get that $f''(\lambda) \ge 0$ for all values of $\alpha > 0$. Therefore we conclude that $f'(\lambda)$ increases from 0 as $\lambda$ increases from 0 to 1.

With that we conclude our discussion of the Havrda-Charvat measure of directed divergence.
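The derivative used in this proof, which simplifies to $f'(\lambda) = -\sum p_i^\alpha u_i^{-\alpha}(q_i - p_i)$ with $u_i = \lambda q_i + (1-\lambda)p_i$, can be compared with a central finite difference. This is a minimal sketch; the distributions, the value $\alpha = 2$, the evaluation point and the step size are all arbitrary assumptions.

```python
def mixture(lam, p, q):
    """u_i = lambda*q_i + (1 - lambda)*p_i."""
    return [lam * qi + (1 - lam) * pi for pi, qi in zip(p, q)]

def f(lam, p, q, alpha):
    """Havrda-Charvat divergence of P from the mixture lambda*Q + (1 - lambda)*P."""
    u = mixture(lam, p, q)
    s = sum(pi ** alpha * ui ** (1 - alpha) for pi, ui in zip(p, u))
    return (s - 1) / (alpha - 1)

def f_prime(lam, p, q, alpha):
    """Analytic first derivative: -sum p_i^alpha * u_i^(-alpha) * (q_i - p_i)."""
    u = mixture(lam, p, q)
    return -sum(pi ** alpha * ui ** (-alpha) * (qi - pi)
                for pi, qi, ui in zip(p, q, u))

P, Q, ALPHA = [0.1, 0.9], [0.29, 0.71], 2  # illustrative choices
lam, h = 0.5, 1e-5

finite_difference = (f(lam + h, P, Q, ALPHA) - f(lam - h, P, Q, ALPHA)) / (2 * h)
analytic = f_prime(lam, P, Q, ALPHA)
at_zero = f_prime(0.0, P, Q, ALPHA)
print(finite_difference, analytic, at_zero)
```

The finite-difference and analytic values agree closely, the derivative at $\lambda = 0$ vanishes up to rounding, and the derivative at $\lambda = 0.5$ is positive for this data.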

2.2.6 Sharma and Gupta's Measures of Directed Divergence:


These measures are given by


D^(p;Q;a^^) 

il 

^ a j 
s p^ q! 


(2,22 ) 


i=l ^ ■ 


D^(p;Qya,3,r) 


1 

J^(p“c4-“ - 

(2#23) 

'" 2 “ 


D^(P:Q7^,r) 

2 ^ 


sin (r In ~) 

(2.24) 

sin r ^tt^i 


Not all three measures are non-negative. Refer to Kapur [24] for the following results:

(i) For D^CP:Q/a^p) it can easily be verified that 


2“^D^(P;Qfa,^ ) + 2*^ D^(Q:Pya,3) = 0 

Therefore for every positive D^(P:Q;a•^) we have a negative 
D^(Q;P/a,^)# 

(ii) Let a. = If 0 ~ 2, r=2 then (2*23) gives 

P ^7 

d'^(P:Q; 1,2,2) = 2 r (pT - p. q. ) (2,25) 

i=i ^ ^ ^ 

Now if we let P = and Q = (2 #2 5)* we gat 
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(P:Q;l,2,2) = -0-083333 < O* 

Again if we take P = U and Q arbitrary in (2*2 5) we get 

P 0 ^ 

D (U:Q;l,2,2) = ^(l - S q. ) = O* 

” i=i ^ 

P 

So D (PoQjtcc,0,y ) can be negative and zero even if P Q* 

(iii) Let a = 1, 8 = 0, r = 4/ n = 2, P = (•5/*5) and Q = (•4#*' 
then from (2*24) we get 

D^(P:Q;i,0,4) = -0*07458 < 0 

Again if we take P = (*9,*l) and Q = (*1,*9) we get 
D^(P!Q/a/ 8 ^ t) = 0* Therefore D^(P:Qjra, 8 , r) can both 
be negative and vanish even if P / Q» 

We can easily see that none of Sharma and Gupta's three measures is additive.

Now consider the following example:

Example 2*2 *6.1 : Let P*Q = (^*3 and R S = (q*o| oIi 8 ^ 

P = ( 0 . 5 , 0 . 5)* Q = ( 0 . 4 , 0.6), R = ( 0 . 2 , 0*8) and S = (0*1, 0.9) 
anda = 8 =2* Then we have 

D^(P -:5- Q;R •»-S;2,2) = -0.0099777, 

D^(PsR; 2,2) = -0*0165904 and D^(Q:Sf2,2 ) = -0.0290038. 

Therefore we have $D^{L}(P*Q : R*S; 2, 2) > D^{L}(P:R; 2, 2) + D^{L}(Q:S; 2, 2)$.

With the same example, if we take $\gamma = 1$ we get a counter-example for the subadditivity of $D^{P}(P:Q; \alpha, \beta, \gamma)$, and if we take $\beta = 2$, $\gamma = 2$ we get a counter-example for the subadditivity of $D^{s}(P:Q; \beta, \gamma)$. So those examples assert that none of Sharma and Gupta's three measures of directed divergence is subadditive.

We now present the results of our numerical calculations for various distributions regarding the subadditivity of the Sharma-Gupta log measure of directed divergence.

Table 2 *2 *6*1 i 
In fact, corresponding to every $\alpha$ lying between 0.1 and 1.9 in increments of 0.1, we have values of $\beta$ where $D^{L}$ changes from being not subadditive to being subadditive. Similarly we have values of $\alpha$ and $\beta$ for data $D_2$, $D_3$, $D_4$ and $D_5$. It is highly tedious to present all values here. Therefore we present all the important information in the following table:

We shall now consider the properties of these measures with respect to their parameters.

Monotonous behaviour of D^(PtQ?a,e,Y) as a function of a 
Let f(a) = D^(P;Qj^a,3,x) • Then we have 

f'(a) s e ^ 2 ^i {log “ log p^^ log 

(2#26) 

We will now show with the help of an example that f' (a) can both 
be negative and positive* 

Example 2*2 *6*2 : Let a = 2^ ^ = 2, P = (0*5,0#5) and 
Q = (0*6,0»4)* Then we have from (2*26) 

f'(a) = 0*00129674 > 0* 

Again let a = 2, 3 = 2, P = (0.5,0*5) and Q = (0#55, 0*45). 

Then we have from (2.26) 

f'(a) = -8.8168 lo“^ < 0* 

This example shows that $f'(\alpha)$ can be positive or negative depending on the distributions, for the same $\alpha$. Therefore we deduce that $f(\alpha)$ does not exhibit monotonic behaviour.

Monotonous behaviour of D^(P;Qfa,g.Y) as a function of ^ : 

Let gO) = D^(PsQ?a»3,x). Then we have 

g'O) = 2'^ S pj 4 {log - §} (2.27) 

Example 2*2*63 ; Let a = 2, ^ = 2, P = (0*5,0-5) and 
Q = (0.55,0*45) in (2*27)* Then we have g' O )=* 4*8097 Xio ^ > O* 
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Again let a = 2, 8 = 2, P = (0.5^0. 5) and Q = (0#6,0*4) 
in (2*27)* Then we have g'(8) = -0*4333437 < O* 

This example illustrates that g(8) is not a monotonic function 
of 8 • 

P 

Monotonous behaviour of D (P;Q?a^8*T) as a function of a : 

P 

Let f^(ot.) = D (P:Q7a,8*'y) then we have 

fpa)= "(p“qP-“ln^) 

i=i ^ ^ % 

. (2'^-P.2’'-P)-2 in 2 2“-f s '') 

i=i ^ ^ 

( 2 * 23 ) 

Example 2*2*6*4 i Let a = 1*5, 8 = 0*5, r = 0, P = (0*5,0*5) 
and Q = (0*55,0*45) in (2*28)* 

Then we have ^(o^^ = 0*2 599682 > 0* 

Again let a = 1*5, 8 = 0»5, y = O, P = (0#2,0*8) and Q=:(0*5,0*5) 
in (2 •>23)* Than we have :^(<x) = -1*5666427 < O* 

This example shows that f(a) is not monotonic with respect to a* 

Monotonous behaviour of D^(PtQ*(X*8, y) as a functi^on of 8. • 

Let g;j^(8) = D^(P:Q?a,8/ 7)* Then we have 



(2*29) 
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Exanvpla 2#2#6#5 : Let a = 1,5, ^ * 0*5, r * 0, P = (0.2, 0#8) 

and Q = (0.5,0*5)# Then from (2.29) we have g^(3)=0.1147965 > 0. 

Again let a = 1.5, 3 = 0.5, r = 0, P = (0.5, 0.5) and Q=(0.6,0.4). 
Then from (2.29) we have g'O) = -0.0307757 < 0. 

P ' * 

Again we have that D (P:Q/a,3,y) is not monotonic w.r* to 

s 

Note that $D^{P}(P:Q; \alpha, \beta, \gamma)$ is symmetric in $\alpha$ and $\gamma$. Therefore our conclusions regarding its monotonicity w.r.t. $\alpha$ hold good w.r.t. $\gamma$ also. Therefore we conclude that Sharma and Gupta's power measure of directed divergence is not a monotonic function of any of its three parameters.

s 

Monotonous behaviour of D (P;Qfj3,y) as a function of P ; 

Let f2(3) = D®(P:Qf^,‘y) • Then we have 

f^Cp) = 2^ (sin y) S (Y In ^)ln (2q^). 

i=i ^ 

(2.29) 

Example 2.2 .6.6 : Let 3 = 1*5, y= 2, P = (0.5,0.5) and 
Q = (0#55,0.45). From (2.29) we get = -0.0109112 < O. 

Again let 0 = 1.5, = 2, P = (0.2, 0*8) and Q = (0*9, 0.1 ). 

From (2.29) we get f 2 (^) = 0*778836 > 0. 

s 

Mo notonous behaviour of D (P:Q/|B,Y) as a function of 7 : 

Let ~ Y) . Then we get 

fi -1 ^ ft P,’ 

g'(r) = 2^(sin Y) S Pi^^idn p^, sin ( Y In ^) 

+ cos( Y In -^) + ( Y In —)} (2.30) 

Sin ' qj^ 



70 


Example 2-2 #6.7 : Let y = i, /3 * 0.5, P = (o.5,0.5) and 
Q = (0.45/0.55). Then from (2 .30) we get g'Cy) = -0.095133 < 0* 

Again let y = 1/ 0 = 0*5, P = (o.2,0.8) and Q » (0.9/0.1). 
Then from (2.30) we get ("^ ) = 0.3666375 > 0. 

Therefore we conclude that D^(P:Qf3/y) is not monotonic 
with respect to either of its parameters. 

Concavity (or convexity) of D^(P:Q^a/P) as a function of a : 

From (2.26)/ f"(a) = 2"^ S pj^qf (log p. )'‘ log(~) 

i=l ^ ^ ^ % 

(2.31) 

E xample 2. 2. 6. 3 : Let oc = 1/ P — 0.2, P ~ (0.5, 0.5) and 
Q = (0.99/0.01). Then from (2.31 ) we get f''(a) = 0.1331302 > O. 

Again let a = 1, ^ = 0.2, P = (0.85/0.15) and Q = (0.5, 0.5). 

Then from (2.31) we get f"(a) = -0.6237022 < 0. 

So we have seen from the example that for the same set of parameter values we have two sets of probability distributions such that $f''(\alpha)$ is positive for one and negative for the other. Therefore we conclude that $D^{L}(P:Q; \alpha, \beta)$ is neither concave nor convex for all $\alpha$.

We have similar examples to show that $D^{L}(P:Q; \alpha, \beta)$ is not convex or concave w.r.t. $\beta$.

We are unable to conclude anything about concavity (or convexity) of the other two measures, viz. $D^{P}(P:Q; \alpha, \beta, \gamma)$ and $D^{s}(P:Q; \beta, \gamma)$, because of the highly complicated nature of their second derivatives with respect to their parameters.

Similarly, we are not able to decide whether or not $D^{L}(P:Q; \alpha, \beta)$ and $D^{s}(P:Q; \beta, \gamma)$ are convex w.r.t. P and Q, or whether or not they satisfy property (6).


But for 0<a<l, ^ >1 ora>l, 0<^ <1, a 4 

P 

We have D (P:Q;a/^, ) satisfying both convexity criterion and 
property (6). See Kapur [l4l# 


With that we conclude our discussion of Sharma and Guptas' 
measures of directed divergence* 


2.2.7 Ferreri's Measure of Directed Divergence:

It is given by

\[ D_\lambda(P:Q) = \frac{1}{\lambda} \sum_{i=1}^{n} (1 + \lambda p_i) \ln \frac{1 + \lambda p_i}{1 + \lambda q_i}, \quad \lambda > 0. \tag{2.32} \]

It can be easily verified that $D_\lambda(P:Q)$ is a non-negative measure of directed divergence. It can also be verified that $D_\lambda(P:Q)$ is a convex function of both P and Q, by considering the functions $f(x) = (1 + \lambda x)\{\ln(1 + \lambda x) - A\}$ and $g(y) = B\{\ln B - \ln(1 + \lambda y)\}$ and the fact that a finite sum of convex functions is convex.

We can easily verify that $D_\lambda(P:Q)$ is not additive. We shall now give an example to show that $D_\lambda(P:Q)$ is not subadditive.


Example 2.2.7.1: Let $\lambda = 1$, $P = (0.5, 0.5)$, $Q = (0.5, 0.5)$, $R = (0.1, 0.9)$ and $S = (0.85, 0.15)$, and let

\[ P*Q = \begin{pmatrix} 0.25 & 0.25 \\ 0.25 & 0.25 \end{pmatrix} \quad \text{and} \quad R*S = \begin{pmatrix} 0.02 & 0.08 \\ 0.83 & 0.07 \end{pmatrix}. \]

Then we have $D_1(P*Q : R*S) = 0.1547$, $D_1(P:R) = 0.0066$ and $D_1(Q:S) = 0.0269$.

Therefore $D_1(P*Q : R*S) > D_1(P:R) + D_1(Q:S)$.


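Proposition 2.2.7.2: $f(\lambda) = D_\lambda(P:Q)$ is monotonically decreasing for $0 < \lambda < \lambda_0$ and monotonically increasing for $\lambda > \lambda_0$, where $\lambda_0$ is a positive solution of the equation

\[ \sum_{i=1}^{n} \frac{p_i - q_i}{1 + \lambda q_i} = \frac{1}{\lambda} \sum_{i=1}^{n} \ln \frac{1 + \lambda p_i}{1 + \lambda q_i}. \tag{2.33} \]

Proof: From the definition of $D_\lambda(P:Q)$ we have

\[ f'(\lambda) = \frac{1}{\lambda} \Big\{ \sum_{i=1}^{n} \Big( \frac{p_i - q_i}{1 + \lambda q_i} + p_i \ln \frac{1 + \lambda p_i}{1 + \lambda q_i} \Big) \Big\} - \frac{1}{\lambda} f(\lambda). \tag{2.34} \]

We can verify easily by applying L'Hospital's rule that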

It f*(X) = - ^ S (p,. -q. < 0 

X-^0 ^ i=l ^ ^ 


(2.35) 


Now we shall consider the zeros of $f'(\lambda)$ for $\lambda > 0$. From (2.34) we get that $f'(\lambda) = 0$ iff

\[ \sum_{i=1}^{n} \Big( \frac{p_i - q_i}{1 + \lambda q_i} + p_i \ln \frac{1 + \lambda p_i}{1 + \lambda q_i} \Big) = \frac{1}{\lambda} \sum_{i=1}^{n} (1 + \lambda p_i) \ln \frac{1 + \lambda p_i}{1 + \lambda q_i}, \]

which is equivalent to (2.33).

We note that equation (2.33) can have more than one solution, all of which are positive. In that case the sign of $f'(\lambda)$ keeps oscillating between plus and minus as we go from one zero to the next one. That completes the proof of Proposition 2.2.7.2.

$D_\lambda(P:Q)$ satisfies property (6) of 2.2.2. See [14]. With that we conclude our discussion of $D_\lambda(P:Q)$.

2.2.8 Kapur's Measure of Directed Divergence of Order $\alpha$ and Type $\beta$:

It is given by




(2 *36 ) 


i=X 


This measure corresponds to Kapur-Aczel and Daroczy measure of 
entropy of order a and type 3, H^^^(p)* 

Kapur at al* [l5] have discussed the properties of this 
measure and shown that for 0<(1<1/ 3>l^or 0<3<i, flt>l \D^^p(P:Q) 

1 < a+^<2 j a+■^ > 2 J 

is non— negative, vanishes only for P = Q and is convex w*r* to 
both P and Q- We could not conclude anything regarding 
Property (6) for this measure* 

Rathie's measure with (n+l) parameters (P:Q) 

is essentially same as (2*36) but for the m parameters 0 ^ 

which were all equal in (2*36)* Therefore if 


0< a<l, 

1 < a+j3^ < 2 


¥ i ~ 1,2, » - ,n or 




> 2 


¥ i = 1, 


(O 


then this measure is non-negative, zero only if $P = Q$, and convex w.r.t. both P and Q. But the major drawbacks of this measure are that it is not symmetric w.r.t. the probabilities, and that we cannot define additivity and subadditivity properties for it.


2.2.9 Kapur's Generalized Measures of Directed Divergence:

(i)

\[ D_a(P:Q) = \sum_{i=1}^{n} p_i \ln \frac{p_i}{q_i} - \frac{1}{a} \sum_{i=1}^{n} (1 + a p_i) \ln \frac{1 + a p_i}{1 + a q_i}, \quad a > -1. \tag{2.37} \]

$D_a(P:Q)$ is a non-negative and pseudo-convex function of both P and Q. It vanishes iff $P = Q$. We can see easily that as $a$ increases from 0 to $\infty$, $D_a(P:Q)$ decreases from $\sum_{i=1}^{n} p_i \ln (p_i / q_i)$ to zero. It does not satisfy property (6). It is neither additive nor subadditive. Consider the following example:

Example 2.2.9.1: Let $P = (0.1, 0.9)$, $Q = (0.3, 0.7)$, $R = (0.45, 0.55)$ and $S = (0.5, 0.5)$. Let $a = 1$ and $P*Q = PQ$ and $R*S = RS$. Then we have

$D_1(P*Q : R*S) = 0.303043$, $D_1(P:R) = 0.209360$ and $D_1(Q:S) = 0.055535$.

Therefore $D_1(P*Q : R*S) > D_1(P:R) + D_1(Q:S)$, which verifies the claim that $D_a(P:Q)$ is not a subadditive measure of directed divergence.
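The failure of subadditivity in Example 2.2.9.1 can be recomputed from the definition $D_a(P:Q) = \sum p_i \ln(p_i/q_i) - \frac{1}{a}\sum (1+ap_i)\ln\frac{1+ap_i}{1+aq_i}$. The sketch below assumes, as the example states, that the joint distributions are the products $PQ$ and $RS$; the function names are illustrative.

```python
import math

def kapur_divergence(p, q, a):
    """Kapur's measure D_a: Kullback-Leibler term minus a Ferreri-type term scaled by 1/a."""
    kl_term = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    correction = sum((1 + a * pi) * math.log((1 + a * pi) / (1 + a * qi))
                     for pi, qi in zip(p, q)) / a
    return kl_term - correction

def product(p, q):
    """Joint distribution of two independent distributions, flattened row by row."""
    return [pi * qj for pi in p for qj in q]

P, Q = [0.1, 0.9], [0.3, 0.7]
R, S = [0.45, 0.55], [0.5, 0.5]

joint = kapur_divergence(product(P, Q), product(R, S), 1)
d_pr = kapur_divergence(P, R, 1)
d_qs = kapur_divergence(Q, S, 1)

# A positive excess means subadditivity fails for this data at a = 1.
excess = joint - d_pr - d_qs
print(joint, d_pr, d_qs, excess)
```

The computed joint value and $D_1(Q:S)$ agree with the figures in the example to about four decimal places, and the excess is positive, as the example concludes.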

The remaining measures of directed divergence due to Kapur are very general measures, constructed to satisfy Properties (i), (ii) and (iii). Their significance lies in the fact that they contain many known measures as special cases. Therefore here we only illustrate some special cases of each of these measures and refer the reader to Kapur [26] for further details on their properties:


(ii) Dq(p:Q) : 


Case 1 ; It D^(PtQ) 

a-l ° 


n p 

r In = D(P:Q) . 

i=l 


Case 2 


Ca s e 3 


a = O# b = 0, c = 0, k = 1, Dq(P:Q) 
= 5^ = c“(P=Q> 


n 


1 (X ' 

a = 0^b = C# c = 0, k-*0 Dq(P:Q) In ^E^p^. q; 


a i— <x 
i^i 


= Dq^(P:Q) 

C ase 4 : a =1, b = 0, c = 0* k=si, D^(P;Q) 

n 


“ To^IT " n g, . ^ 

vv* a A -a 

1=1 


G 

). 


This is the error function obtained by Lubbe [22]* 


(iii) D^(P:Q) : 


1 rr^ 1 


C ase 1 : D^^jCP:Q) = ^ ^ ^ a>l,j>i or a<l,0<j<l 


i=l 


For i =1. D .(P:Q) reduces to Havrda— Charva 
' <x# j 


t measure* 


n 


CX jL 

C ase 2 ? Dj^(P:Q) = In 2 P^q:^ # ct ^ 1* 

i=i 

This is the well known Renyi's measure of directed 
divergence* 


m 
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Case 3 


0# j / j 


./(P:Q) 




where a > 1, j > i, ^ < 1, 0 < j' < 1 or a < 1,0 < j < 1/ 


^>l, j >1# Ifj = j'=l this measure reduces to Sharma 
and Tanejas* measure of directed divergence# 


,2 

4 I D^^gCP-.Q) = 


8 1-B 

s p^q7 

i=l ^ ^ 


- 1}/ a > 1, 


< i 


or ct < 1, 3 > 1, a / 6* 
( iv) D ^^^(PtQ) : 


X —1 

Case 1 : If we take W(x) = - , we get all the measures 

obtained as special cases for the measure D^(PsQ) as special 
cases here also see [26]# 


Case 2 : If we take ^p(x) = x In x, we get the limiting cases 
of special cases of D^(P:Q) as a 1, as special cases, here [26]* 

Case 3 : If we put ^f(x) = x In x - ^(l+ax)ln(l+ax) 

+ ~(l+a) ln(i+a) then we get a function of Kapur's measure 
of directed divergence [ 17,26]# 


From these generalized measures of directed divergence, Kapur has also obtained measures of entropy by making use of

\[ D(P:U) = H(U) - H(P), \]

where $U = (\tfrac{1}{n}, \ldots, \tfrac{1}{n})$ is the uniform distribution. Now we conclude this section by presenting a table of measures of directed divergence versus their properties.



The following convention is followed in Table No. 2.2.10.1:

Yes : the property is satisfied

No : the property is not satisfied

Yes/No : the property is conditionally satisfied

- : unknown

X : not applicable
2.3 Measures of Inaccuracy

2.3.1 List of Measures of Inaccuracy

(1) Kerridge [44]

\[ I(P:Q) = -\sum_{i=1}^{n} p_i \log q_i \]

(2) Nath [48]

\[ I_\alpha(P:Q) = \frac{1}{1-\alpha} \log \sum_{i=1}^{n} p_i q_i^{\alpha-1}, \quad \alpha \neq 1,\ \alpha > 0 \]

(3) Rathie and Kannappan [17]

i“(PiQ) = f ^ -1). a/1, a>0 

Sharma and Gupta [59] 

(4) Log measure

^(P:Q) = - 2^ E p^g| log a > 0 / 0 > O 
^ i*“i 


(5) Power measure


i^^^(P:Q) = -z - y S a > o, y > 0, $ 

^ 2 *“2 i™l 


(6) Sine measure

I? _ (P:Q) = 

tt/P/ 


E sindog q^) ,a>0,^>0, T 4 O 


Kapur's measures [44#60] 

. ,-l -b/ ava, ^ i«-axb/ ^ a 1-axC 

Ii.(P:Q) = ((a-l)k) ^{n ( E P^) ^ 

JV •? —•15 *1 ?!!!• i 


i"*l 


i=l i=l i=l 

btc+ (a-i )lc, , -k (a-1 ) 


- ( S -I- (,s 


i=i 


i=L 


( 7 ) 
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(8) If a = b = c = o;k = 1 in (7), we get 


(J^p“ - 1) 

(9) Ifa='b=c = 0;^lc-+0, in(7) we get Lubbers {60 j 
mea sure 

n „ . n 


l‘^(P:Q) 


CT log { S Pi4*^/ E p^} » 
i=l i=l ^ 


(lo) lfa=:b=:c = 0/k = -I, we get a new measure 
I^(P-.Q) = i )+!^(l. 




i-a 


n 
S p 


■) 


a 


i=l i=l 

(11) Ifa = l, b=c = 0, k=-i, we get Lubbers [60] measure 

n 


““i 


E P 
i=l 


a 


■) 


^ a 1-a 
i=l ^ ^ 


2.3.2 Properties of Measures of Inaccuracy:

Let $P = (p_1, p_2, \ldots, p_n)$ be the true probability distribution and let $Q = (q_1, q_2, \ldots, q_n)$ be the asserted probability distribution. Let $I(P:Q)$ be an inaccuracy measure. Then the following properties are verified for each of the measures listed in 2.3.1.

I. The function I is continuous in $p_i$ and $q_i$ for all i.

II. When N equally likely outcomes are stated to be equally likely, then I is a monotonic increasing function of N.

III. If a statement is broken down into a number of subsidiary statements, the inaccuracy of the original statement is a weighted sum of the inaccuracies of the subsidiary statements. For example, we should have

$I(\alpha, \beta, \ldots) = I(\alpha, 1-\alpha; \varepsilon, 1-\varepsilon) + \ldots$

IV. The inaccuracy of a statement is unchanged if two alternatives about which the same assertion is made are combined. For example,

$I(\alpha, \beta, \gamma; \varepsilon, \zeta, \zeta) = I(\alpha, \beta + \gamma; \varepsilon, \zeta)$.

V. The quantity I is zero iff $p_i = q_i = 1$ for some value of i and, consequently, $p_i = q_i = 0$ for all other values of i.

VI. $I(P:Q)$ approaches infinity if $q_i = 0$ and the corresponding $p_i \neq 0$.

VII. The value of I is minimum for a fixed $\{p_i\}$ when $q_i = p_i$ for all i. This value of inaccuracy is the amount of uncertainty involved in the distribution $\{p_i\}$.

VIII. If variations of both $p_i$ and $q_i$ are considered, the point $p_i = q_i = \tfrac{1}{n}$ is a minimax point.

IX. If two sets of alternatives are asserted to have probabilities which are independent, the inaccuracy of the joint assertion is the sum of the separate inaccuracies.

X. Subadditivity: $I(P*Q : R*S) \le I(P:R) + I(Q:S)$.

XI. We require that $I(P:Q)$ is a convex function of Q.
2.3.3 Renyi's Measure of Inaccuracy:

P. Nath [48] introduced this measure. But due to its resemblance in form to Renyi's measure of information, we call it Renyi's measure of inaccuracy. Let $P = (p_1, p_2, \ldots, p_n)$, $p_i \ge 0$, $\sum_{i=1}^{n} p_i = 1$, be the true probability distribution of an experiment and let $Q = (q_1, q_2, \ldots, q_n)$, $q_i \ge 0$, $\sum_{i=1}^{n} q_i = 1$, be the asserted probability distribution of the experiment. Then Renyi's inaccuracy is defined as

\[ I_\alpha(P:Q) = \frac{1}{1-\alpha} \ln \sum_{i=1}^{n} p_i q_i^{\alpha-1}. \tag{2.38} \]

$I_\alpha(P:Q)$ is a continuous function of $p_i$ and $q_i$ for all i. We shall now verify property II in the following:

Proposition 2.3.3.1: When n equally likely outcomes are asserted to be equally likely, then $I_\alpha(P:Q)$ is a monotonically increasing function of n, $n > 1$.


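Proof: Let $P = Q = U = (\tfrac{1}{n}, \ldots, \tfrac{1}{n})$. Then we have

\[ I_\alpha(U:U) = \frac{1}{1-\alpha} \ln \Big( \sum_{i=1}^{n} \frac{1}{n} \cdot \frac{1}{n^{\alpha-1}} \Big) = \ln n, \]

and we know that $\ln n$ is a monotonically increasing function for $n > 1$.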
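The identity $I_\alpha(U:U) = \ln n$ used in this proof is easy to confirm numerically from the definition (2.38). The sketch below uses $\alpha = 2$ and a few small values of n; both choices are arbitrary and made only for illustration.

```python
import math

def renyi_inaccuracy(p, q, alpha):
    """Renyi (Nath) inaccuracy: ln(sum p_i * q_i^(alpha - 1)) / (1 - alpha)."""
    s = sum(pi * qi ** (alpha - 1) for pi, qi in zip(p, q))
    return math.log(s) / (1 - alpha)

ALPHA = 2  # illustrative order

# Inaccuracy when the uniform distribution on n outcomes is asserted correctly.
uniform_values = {}
for n in range(2, 7):
    u = [1 / n] * n
    uniform_values[n] = renyi_inaccuracy(u, u, ALPHA)
    print(n, uniform_values[n], math.log(n))
```

Each computed value coincides with $\ln n$ up to rounding, so the sequence increases with n.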


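$I_\alpha(P:Q)$ does not have a recursive property. So now we verify property IV.

Proposition 2.3.3.2: $I_\alpha(p_1, p_2, p_3; q_1, q_2, q_2) = I_\alpha(p_1, p_2 + p_3; q_1, q_2)$.

Proof:

\[ I_\alpha(p_1, p_2, p_3; q_1, q_2, q_2) = \frac{1}{1-\alpha} \ln \big( p_1 q_1^{\alpha-1} + p_2 q_2^{\alpha-1} + p_3 q_2^{\alpha-1} \big) = \frac{1}{1-\alpha} \ln \big( p_1 q_1^{\alpha-1} + (p_2 + p_3) q_2^{\alpha-1} \big) = I_\alpha(p_1, p_2 + p_3; q_1, q_2). \]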

Proposition 2.3.3.3: $I_\alpha(P:Q)$ is zero if and only if $p_i = q_i = 1$ for some value of i and $p_i = q_i = 0$ for all other values.

Proof: Sufficiency is obvious. We shall now prove the necessity part. Let

\[ \frac{1}{1-\alpha} \ln \Big( \sum_{i=1}^{n} p_i q_i^{\alpha-1} \Big) = 0. \]

That means $\sum_{i=1}^{n} p_i q_i^{\alpha-1} = 1$, or

\[ \sum_{i=1}^{n} p_i (q_i^{\alpha-1} - 1) = 0. \tag{2.39} \]

Now there are two possibilities. One is that both P and Q are degenerate distributions with matching non-zero components, and the other is that P and Q are any two distributions. Let us now consider the second possibility.

Then we interpret (2.39) as the average of n numbers which are all of the same sign (depending on $\alpha > 1$ or $\alpha < 1$, they are negative or positive) with at least two being non-zero. Hence the average cannot be zero. Therefore only the first possibility remains valid, i.e., $p_i = q_i = 1$ for some i and $p_i = q_i = 0$ for all other i.

Property VI is obviously not satisfied. For consider
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Example 2 *3*3,1 : P = (1-^ 0, l^) and ^ 

Then we have Iq^(?:Q) = In 4 * 


Remark: In the definition of $I_\alpha(P:Q)$, Nath [48] had overlooked the point that zero cannot be raised to a negative power. That is, his definition included all $\alpha > 0$ except $\alpha = 1$. Then, for $\alpha < 1$, we cannot have any $q_i$ equal to zero with a meaningful definition for $I_\alpha(P:Q)$. Hence what should be considered as the valid range for the parameter $\alpha$ is $\alpha > 1$.

Property VII is satisfied. Let $q_i = p_i$ for all $i = 1, \ldots, n$. Then we have $I_\alpha(P:P) = \frac{1}{1-\alpha} \ln \big( \sum_{i=1}^{n} p_i^{\alpha} \big)$, which is equal to the uncertainty involved in $P = \{p_i\}$. However, this is not the minimum value of $I_\alpha(P:Q)$. We have obtained the minimum value of $I_\alpha(P:Q)$ in the following


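Proposition 2.3.3.4: The minimum value of $I_\alpha(P:Q)$ for a fixed P is attained when

\[ q_i = \frac{p_i^{1/(2-\alpha)}}{\sum_{j=1}^{n} p_j^{1/(2-\alpha)}}, \quad i = 1, \ldots, n, \]

and

\[ \min_Q \{ I_\alpha(P:Q) \} = \frac{2-\alpha}{1-\alpha} \ln \Big( \sum_{i=1}^{n} p_i^{1/(2-\alpha)} \Big). \]

As $\alpha \to 1$,

\[ \min_Q \{ I_\alpha(P:Q) \} \to -\sum_{i=1}^{n} p_i \ln p_i, \]

which is the minimum value of Kerridge's inaccuracy for a fixed P.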


Proof : We shall note later that $I_\alpha(P:Q)$ is a convex function of the $q_i$'s. Using that fact and the Lagrange multipliers method we shall obtain the minimum value of $I_\alpha(P:Q)$.

Let
$$L \equiv \frac{1}{1-\alpha}\ln\Big(\sum_{i=1}^{n} p_i q_i^{\alpha-1}\Big) - \lambda\Big(\sum_{i=1}^{n} q_i - 1\Big).$$
On equating $\partial L/\partial q_i$ to zero and solving for $q_i$ we get
$$q_i = A\,p_i^{1/(2-\alpha)} \tag{2.40}$$
where the constant A is to be obtained from $\sum_{i=1}^{n} q_i = 1$. We obtain
$$q_i = \frac{p_i^{1/(2-\alpha)}}{\sum_{i=1}^{n} p_i^{1/(2-\alpha)}} \tag{2.41}$$
as the p.d. which minimizes $I_\alpha(P:Q)$ for a fixed P. And
$$\min_Q\{I_\alpha(P:Q)\} = \frac{2-\alpha}{1-\alpha}\,\ln\Big(\sum_{i=1}^{n} p_i^{1/(2-\alpha)}\Big) \tag{2.42}$$
is obtained by substituting (2.41) in (2.38).


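We now want to find $\lim_{\alpha\to1}\min_Q\{I_\alpha(P:Q)\}$. We have
$$\lim_{\alpha\to1}\min_Q\{I_\alpha(P:Q)\} = \lim_{\alpha\to1}\frac{2-\alpha}{1-\alpha}\,\ln\Big(\sum_{i=1}^{n} p_i^{1/(2-\alpha)}\Big).$$
Using L'Hospital's rule,
$$= \lim_{\alpha\to1}\Big\{\ln\Big(\sum_{i=1}^{n} p_i^{1/(2-\alpha)}\Big) - \frac{1}{2-\alpha}\,\frac{\sum_{i=1}^{n} p_i^{1/(2-\alpha)}\ln p_i}{\sum_{i=1}^{n} p_i^{1/(2-\alpha)}}\Big\} = -\sum_{i=1}^{n} p_i \ln p_i .$$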
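The statement of Proposition 2.3.3.4 can be checked numerically. The following is only a sketch: the function names are illustrative, the measure is taken in the form $I_\alpha(P:Q) = \frac{1}{1-\alpha}\ln\sum p_i q_i^{\alpha-1}$, and the check is restricted to $1 < \alpha < 2$, where $\sum p_i q_i^{\alpha-1}$ is concave in Q so that the stationary point is a genuine minimum. The distribution P is an arbitrary choice.

```python
import math
import random

def renyi_inaccuracy(p, q, alpha):
    """I_alpha(P:Q) = ln(sum_i p_i * q_i**(alpha-1)) / (1 - alpha)."""
    s = sum(pi * qi ** (alpha - 1.0) for pi, qi in zip(p, q) if pi > 0)
    return math.log(s) / (1.0 - alpha)

def stationary_q(p, alpha):
    """Candidate minimiser: q_i proportional to p_i**(1/(2-alpha))."""
    w = [pi ** (1.0 / (2.0 - alpha)) for pi in p]
    total = sum(w)
    return [wi / total for wi in w]

def closed_form_min(p, alpha):
    """((2-alpha)/(1-alpha)) * ln(sum_i p_i**(1/(2-alpha)))."""
    s = sum(pi ** (1.0 / (2.0 - alpha)) for pi in p)
    return (2.0 - alpha) / (1.0 - alpha) * math.log(s)

def shannon_entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def random_distribution(n, rng):
    """A strictly positive random point of the probability simplex."""
    x = [rng.random() + 1e-3 for _ in range(n)]
    t = sum(x)
    return [xi / t for xi in x]

p = [0.5, 0.3, 0.2]      # hypothetical fixed P
alpha = 1.5
q_star = stationary_q(p, alpha)
value_at_q_star = renyi_inaccuracy(p, q_star, alpha)

rng = random.Random(0)
trial_values = [renyi_inaccuracy(p, random_distribution(3, rng), alpha)
                for _ in range(2000)]

print("value at stationary Q :", value_at_q_star)
print("closed form           :", closed_form_min(p, alpha))
print("best random trial     :", min(trial_values))
print("alpha near 1          :", closed_form_min(p, 1.0001), "vs entropy", shannon_entropy(p))
```

No random trial should fall below the value at the stationary Q, and the closed form taken at $\alpha$ close to 1 should be close to the Shannon entropy of P.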

Property VII is obviously not satisfied by (2.38) because it cannot be expressed as a sum of entropy and directed divergence; besides, the uncertainty of the distribution P is not the minimum of $I_\alpha(P:Q)$. Hence there is no question of having a minimax point for $I_\alpha(P:Q)$.

$I_\alpha(P:Q)$ is an additive measure of inaccuracy. In other words $I_\alpha(P:Q)$ satisfies property IX. The proof is very simple as $I_\alpha(P:Q)$ involves the logarithm function. $I_\alpha(P:Q)$ is not a subadditive measure. True to expectations, it obeys the subadditivity rule for one value of $\alpha$ for a set of probability distributions, and for another value it disobeys the rule for the same set of probability distributions. We have noted our findings in the following

Example 2.3.3.2 : Let P*Q = (…) and R*S = (…), so that P = (.1, .9), Q = (.29, .71), R = (.09, .91) and S = (.33, .67).

Then we have $I_\alpha(P*Q:R*S) - I_\alpha(P:R) - I_\alpha(Q:S) = -0.0118943$ for $\alpha$ = … and $= 0.01537$ for $\alpha = 2$.

Finally we discuss the convexity property of $I_\alpha(P:Q)$ w.r. to the $q_i$'s in the following

Proposition 2.3.3.5 : $I_\alpha(P:Q)$ is a convex function of $q_i$'s if $\alpha > 2$.

Proof : $I_\alpha(P:Q) = \frac{1}{1-\alpha}\ln\big(\sum p_i q_i^{\alpha-1}\big)$, and
$$\frac{\partial^2 I_\alpha}{\partial q_i^2} = \dots > 0 \quad \text{for } \alpha > 2 .$$
Note that when we are checking $I_\alpha(P:Q)$ for its convexity as a function of $q_i$, we have to treat the remaining $q_j$'s as constants. That proves our claim.

Renyi's measure of inaccuracy is a continuous function of both true probabilities and the asserted probabilities. When $p_i = q_i = \frac{1}{n}$ ∀ i = 1,...,n, $I_\alpha(P:Q)$ is a monotonically increasing function of n. $I_\alpha(P:Q)$ is unchanged if two alternatives about which the same assertion is made are combined. These are three of the four axioms on which Kerridge built his measure and proved the uniqueness of it up to a multiplicative constant. $I_\alpha(P:Q)$, satisfying a majority of these axioms and several other properties that are satisfied by Kerridge's measure, becomes almost a perfect measure of inaccuracy.


2.3.4 Rathie and Kannappan Measure of Inaccuracy


Rathie and Kannappan [4?] defined a measure of inaccuracy which corresponds to Havrda-Charvat's measure of information as follows :
$$I^\beta(P:Q) = \frac{1}{2^{\beta-1}-1}\Big\{\sum_{i=1}^{n} p_i q_i^{1-\beta} - 1\Big\}, \qquad \beta \ne 1 . \tag{2.43}$$
As $\beta \to 1$ it approaches $-\sum_{i=1}^{n} p_i \ln q_i / \ln 2$, which is a constant multiple of Kerridge's measure.

$I^\beta(P:Q)$ is a continuous function of both $p_i$'s and $q_i$'s, which is fairly obvious. Now we shall verify property II in the following proposition.



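Proposition 2.3.4.1 : $I^\beta(P:Q)$ is a monotonically increasing function of n for $p_i = q_i = \frac{1}{n}$ ∀ i = 1,...,n.

Proof :
$$I^\beta(U:U) = \frac{1}{2^{\beta-1}-1}\Big\{\sum_{i=1}^{n}\frac{1}{n}\Big(\frac{1}{n}\Big)^{1-\beta} - 1\Big\} = \frac{n^{\beta-1}-1}{2^{\beta-1}-1} = f(n), \text{ say.}$$
Then we have
$$f'(n) = \frac{\beta-1}{2^{\beta-1}-1}\, n^{\beta-2} .$$
Case (i) : $\beta > 1$; then $\frac{\beta-1}{2^{\beta-1}-1} > 0$, since numerator and denominator are both positive, and hence $f'(n) > 0$.

Case (ii) : $\beta < 1$; then $(\beta-1) < 0$ and $(2^{\beta-1}-1) < 0$, and again $f'(n) > 0$.

$I^\beta(P:Q)$ does not satisfy the third property. But property IV is satisfied. It is verified in the following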

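Proposition 2.3.4.2 : $I^\beta(p_1,p_2,p_3 ; q_1,q_2,q_2) = I^\beta(p_1, p_2+p_3 ; q_1, q_2)$.

Proof :
$$I^\beta(p_1,p_2,p_3 ; q_1,q_2,q_2) = \frac{1}{2^{\beta-1}-1}\big\{p_1 q_1^{1-\beta} + p_2 q_2^{1-\beta} + p_3 q_2^{1-\beta} - 1\big\}$$
$$= \frac{1}{2^{\beta-1}-1}\big\{p_1 q_1^{1-\beta} + (p_2+p_3)\, q_2^{1-\beta} - 1\big\} = I^\beta(p_1, p_2+p_3 ; q_1, q_2).$$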


Remark : Renyi's measure of inaccuracy cannot be defined for P and Q both degenerate distributions but with non-matching non-zero components. But Rathie-Kannappan's measure can be defined for these distributions also. However in that case

I^(P:Q) -= I^(U'. !U. ) 7 ^ a, but- 

^ ^ ( 1 ^ 2 ^ ") 
With this in mind we now prove the Vth property in the following


Proposition 2.3.4.3 : $I^\beta(P:Q) = 0$ if and only if $p_i = q_i = 1$ for some i = j and $p_i = q_i = 0$ for all i ≠ j.

Proof : The sufficiency part of the proof is obvious. The necessity part of it runs exactly like the proof of Proposition 2.3.3.3, for
$$\frac{1}{2^{\beta-1}-1}\Big\{\sum_{i=1}^{n} p_i q_i^{1-\beta} - 1\Big\} = 0 \iff \sum_{i=1}^{n} p_i\,(q_i^{1-\beta} - 1) = 0 ,$$
which is of the same form as (2.39).

Property VI is not satisfied. We have already discussed a situation similar to this property in the Remark after Prop. 2.3.4.2. Property VII is also not satisfied because $I^\beta(P:Q)$ cannot be expressed as a sum of information and directed divergence.

With this we now come to deriving the minimum value of $I^\beta(P:Q)$ for a fixed P.


Proposition 2.3.4.4 : The minimum value of $I^\beta(P:Q)$ for a fixed P is attained when
$$q_i = \frac{p_i^{1/\beta}}{\sum_{i=1}^{n} p_i^{1/\beta}}$$
and it is given by $\frac{1}{2^{\beta-1}-1}\big\{\big(\sum_{i=1}^{n} p_i^{1/\beta}\big)^{\beta} - 1\big\}$. Moreover as $\beta$ approaches 1, $\min_Q I^\beta(P:Q)$ approaches $-\sum_{i=1}^{n} p_i \ln p_i / \ln 2$.

Proof : $I^\beta(P:Q)$ is convex with respect to $q_i$'s for $\beta > 1$. Making use of this fact and applying Lagrange's multipliers method we minimize $I^\beta(P:Q)$ subject to $\sum_{i=1}^{n} q_i = 1$, to get
$$q_i = \frac{p_i^{1/\beta}}{\sum_{i=1}^{n} p_i^{1/\beta}} \tag{2.44}$$
as the minimizing p. distn. The minimum value of $I^\beta(P:Q)$ is obtained by substituting (2.44) in (2.43) as


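$$\min_Q I^\beta(P:Q) = \frac{1}{2^{\beta-1}-1}\Big\{\sum_{i=1}^{n} p_i\Big(\frac{p_i^{1/\beta}}{\sum_{j=1}^{n} p_j^{1/\beta}}\Big)^{1-\beta} - 1\Big\} = \frac{1}{2^{\beta-1}-1}\Big\{\Big(\sum_{i=1}^{n} p_i^{1/\beta}\Big)^{\beta} - 1\Big\}. \tag{2.45}$$

Now we shall show that $\min_Q I^\beta(P:Q) \to -\big(\sum_{i=1}^{n} p_i \ln p_i\big)/\ln 2$ as $\beta \to 1$:
$$\lim_{\beta\to1}\min_Q I^\beta(P:Q) = \lim_{\beta\to1}\frac{1}{2^{\beta-1}-1}\Big\{\Big(\sum_{i=1}^{n} p_i^{1/\beta}\Big)^{\beta} - 1\Big\}$$
$$= \lim_{\beta\to1}\frac{1}{2^{\beta-1}\ln 2}\Big\{-\frac{1}{\beta}\Big(\sum_{i=1}^{n} p_i^{1/\beta}\Big)^{\beta-1}\sum_{i=1}^{n} p_i^{1/\beta}\ln p_i + \Big(\sum_{i=1}^{n} p_i^{1/\beta}\Big)^{\beta}\ln\Big(\sum_{i=1}^{n} p_i^{1/\beta}\Big)\Big\}$$
$$= -\Big(\sum_{i=1}^{n} p_i \ln p_i\Big)/\ln 2 .$$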
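Proposition 2.3.4.4 can be checked numerically in the same spirit. The sketch below assumes the form $I^\beta(P:Q) = (2^{\beta-1}-1)^{-1}\{\sum p_i q_i^{1-\beta} - 1\}$; the function names and the distribution P are illustrative choices.

```python
import math
import random

def rk_inaccuracy(p, q, beta):
    """Assumed form: (sum_i p_i * q_i**(1-beta) - 1) / (2**(beta-1) - 1)."""
    s = sum(pi * qi ** (1.0 - beta) for pi, qi in zip(p, q) if pi > 0)
    return (s - 1.0) / (2.0 ** (beta - 1.0) - 1.0)

def rk_minimiser(p, beta):
    """q_i proportional to p_i**(1/beta)."""
    w = [pi ** (1.0 / beta) for pi in p]
    total = sum(w)
    return [wi / total for wi in w]

def rk_minimum(p, beta):
    """((sum_i p_i**(1/beta))**beta - 1) / (2**(beta-1) - 1)."""
    s = sum(pi ** (1.0 / beta) for pi in p)
    return (s ** beta - 1.0) / (2.0 ** (beta - 1.0) - 1.0)

def entropy_bits(p):
    """Shannon entropy divided by ln 2, i.e. measured in bits."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0) / math.log(2.0)

def random_distribution(n, rng):
    """A strictly positive random point of the probability simplex."""
    x = [rng.random() + 1e-3 for _ in range(n)]
    t = sum(x)
    return [xi / t for xi in x]

p = [0.5, 0.3, 0.2]          # hypothetical fixed P
rng = random.Random(1)
for beta in (0.5, 2.0):
    q_star = rk_minimiser(p, beta)
    trials = [rk_inaccuracy(p, random_distribution(3, rng), beta) for _ in range(2000)]
    print(beta, rk_inaccuracy(p, q_star, beta), rk_minimum(p, beta), min(trials))

# As beta -> 1 the minimum should approach the entropy of P in bits.
print(rk_minimum(p, 1.0001), entropy_bits(p))
```

For each $\beta$, the first two printed values should coincide and the third (the best random trial) should not be smaller.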

Let P, Q, R and S be any probability distributions such that P and Q, and R and S, are mutually independent.

Then we have
$$I^\beta(PQ:RS) = \frac{1}{2^{\beta-1}-1}\Big\{\sum_{j=1}^{m}\sum_{i=1}^{n} p_i q_j\, r_i^{1-\beta} s_j^{1-\beta} - 1\Big\}$$
$$= \frac{1}{2^{\beta-1}-1}\Big\{\sum_{i=1}^{n} p_i r_i^{1-\beta}\cdot\sum_{j=1}^{m} q_j s_j^{1-\beta} - 1\Big\}$$
$$= I^\beta(P:R) + I^\beta(Q:S) + (2^{\beta-1}-1)\, I^\beta(P:R)\, I^\beta(Q:S). \tag{2.46}$$
From (2.46) we can infer that $I^\beta(P:Q)$ is not additive. Therefore $I^\beta(P:Q)$ does not satisfy property IX for all values of $\beta$.

$I^\beta(P:Q)$ is neither uniformly subadditive nor uniformly superadditive. We deduce that in the following example.


Example 2.3.4.1 : Consider the same distributions as we did in Example 2.3.3.2. Then we get

$I^{1/2}(P*Q:R*S) - I^{1/2}(P:R) - I^{1/2}(Q:S) = 1.1506211 - 1.2417572 = -0.0911361$

and

$I^{2}(P*Q:R*S) - I^{2}(P:R) - I^{2}(Q:S) = 2.9327869 - 2.0386115 = 0.8941754$

So the conclusion is that $I^\beta(P:Q)$ is subadditive for $\beta = \frac{1}{2}$ and superadditive for $\beta = 2$ for the distributions given in Example 2.3.3.2.
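The non-additivity relation for independent distributions can be verified numerically. The sketch below assumes the form $I^\beta(P:Q) = (2^{\beta-1}-1)^{-1}\{\sum p_i q_i^{1-\beta} - 1\}$, under which a direct expansion gives $I^\beta(PQ:RS) = I^\beta(P:R) + I^\beta(Q:S) + (2^{\beta-1}-1)I^\beta(P:R)I^\beta(Q:S)$. The marginals are those legible in Example 2.3.3.2, with the first component of R taken as 0.09 (an assumption); only product distributions are used here, not the joint distributions of the example.

```python
def rk_inaccuracy(p, q, beta):
    """Assumed form: (sum_i p_i * q_i**(1-beta) - 1) / (2**(beta-1) - 1)."""
    s = sum(pi * qi ** (1.0 - beta) for pi, qi in zip(p, q) if pi > 0)
    return (s - 1.0) / (2.0 ** (beta - 1.0) - 1.0)

def product(p, q):
    """Product (independent) distribution, flattened to a list."""
    return [pi * qj for pi in p for qj in q]

P, Q = [0.1, 0.9], [0.29, 0.71]
R, S = [0.09, 0.91], [0.33, 0.67]   # first component of R is an assumed reading

def additivity_residual(beta):
    """I(PQ:RS) minus the right-hand side of the expanded relation."""
    joint = rk_inaccuracy(product(P, Q), product(R, S), beta)
    a = rk_inaccuracy(P, R, beta)
    b = rk_inaccuracy(Q, S, beta)
    cross = (2.0 ** (beta - 1.0) - 1.0) * a * b
    return joint - (a + b + cross)

for beta in (0.5, 2.0):
    print(beta, additivity_residual(beta))
```

The residual should vanish up to rounding for every $\beta \ne 1$, which shows that the departure from additivity is exactly the cross term.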


We shall now consider convexity of $I^\beta(P:Q)$ w.r. to $q_i$'s. We have from (2.43)
$$\frac{\partial I^\beta(P:Q)}{\partial q_i} = \frac{1-\beta}{2^{\beta-1}-1}\, p_i q_i^{-\beta} \quad\text{and}\quad \frac{\partial^2 I^\beta(P:Q)}{\partial q_i^2} = \frac{\beta(\beta-1)}{2^{\beta-1}-1}\, p_i q_i^{-\beta-1} > 0 \quad \forall\ \beta > 0 .$$
Therefore we conclude that $I^\beta(P:Q)$ is a convex function of $q_i$'s.

We conclude this subsection with a brief review of the properties possessed by the Rathie-Kannappan measure of inaccuracy, $I^\beta(P:Q)$. It is a continuous function of both true and asserted probabilities. $I^\beta(U:U)$ is an increasing function of n, the number of components of U. It does not satisfy the recursivity property but it is a convex function of $q_i$'s. It is neither additive nor subadditive. It attains its minimum value for $q_i = p_i^{1/\beta}/\sum_{i=1}^{n} p_i^{1/\beta}$; this minimum value tends to $-\big(\sum_{i=1}^{n} p_i \ln p_i/\ln 2\big)$ as $\beta \to 1$, expectedly. This minimum value is zero iff $p_i = q_i = 1$ for some i and $p_i = q_i = 0$ for all other values of i. This measure deviates from Renyi's measure in only one property, the additivity property.


2.3.5 Sharma and Gupta's Measures of Inaccuracy


Sharma and Gupta have proposed [59] the following measures of inaccuracy of two or more parameters.

(a) Log-Measure :
$$I_{\alpha,\beta}(P:Q) = -2^{\beta}\sum_{i=1}^{n} p_i^{\alpha} q_i^{\beta}\log q_i, \qquad \alpha > 0,\ \beta > 0 \tag{2.47}$$

(b) Power-Measure :
$$I^{P}_{\alpha,\beta,\gamma}(P:Q) = (2^{-\beta} - 2^{-\gamma})^{-1}\sum_{i=1}^{n} p_i^{\alpha}(q_i^{\beta} - q_i^{\gamma}), \qquad \alpha > 0,\ \gamma > 0,\ \beta \ne \gamma \tag{2.48}$$

(c) Sine-Measure :
$$I^{S}_{\alpha,\beta,\gamma}(P:Q) = -\frac{1}{\sin\gamma}\sum_{i=1}^{n} p_i^{\alpha} q_i^{\beta}\sin(\gamma\log q_i), \qquad \alpha > 0,\ \beta > 0,\ \gamma \ne 0 \tag{2.49}$$

(a) Log-Measure

From the definition one can easily note that $I_{\alpha,\beta}(P:Q)$ is a continuous function of $p_i$'s and $q_i$'s. Therefore it satisfies property I.

Now we shall verify the second property in the following

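Proposition 2.3.5.1 : $I_{\alpha,\beta}(P:Q)$ is a monotonically increasing function of n for $p_i = q_i = \frac{1}{n}$ ∀ i = 1,...,n iff $n > \exp\{1/(1-\alpha-\beta)\}$ for $\alpha+\beta < 1$, etc.

Proof : We have
$$I_{\alpha,\beta}(U:U) = -2^{\beta}\, n\,\Big(\frac{1}{n}\Big)^{\alpha}\Big(\frac{1}{n}\Big)^{\beta}\ln\Big(\frac{1}{n}\Big) = 2^{\beta}\, n^{1-\alpha-\beta}\ln n = f(n), \text{ say; then}$$
$$f'(n) = 2^{\beta}\, n^{-\alpha-\beta}\{(1-\alpha-\beta)\ln n - 1\} \tag{2.50}$$
For $f'(n) > 0$ we must have $(1-\alpha-\beta)\ln n - 1 > 0$, because $n^{-\alpha-\beta}$ is always > 0 and so is $2^{\beta}$.

Case (i) : Let $\alpha+\beta < 1$; then we have $f'(n) > 0$ iff
$(1-\alpha-\beta)\ln n - 1 > 0$
or $\ln n > \frac{1}{1-\alpha-\beta}$
or $n > \exp\{1/(1-\alpha-\beta)\}$.

Case (ii) : Let $\alpha+\beta > 1$; then we have $f'(n) > 0$ if and only if
$(1-\alpha-\beta)\ln n - 1 > 0$
or $(1-\alpha-\beta)\ln n > 1$
or $\ln n < \frac{1}{1-\alpha-\beta}$
or $n < \exp\{1/(1-\alpha-\beta)\}$.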

We shall illustrate the situation in the following

Example 2.3.5.1 : Let $\alpha = 0.5$ and $\beta = 1.5$; then we take up Case (ii) of Proposition 2.3.5.1, i.e., we have

$n < \exp\{1/(1-\alpha-\beta)\} = \exp\{-1\} = e^{-1} < 1$

which is impossible because we want n to be a positive integer greater than or equal to 2.

Now consider $\alpha = 0.5$ and $\beta = 0.4$; then we pertain to Case (i) of Proposition 2.3.5.1, i.e., we have

$n > \exp\{1/(1-\alpha-\beta)\} = \exp\{10\}$.

Now consider n = 10, say. Then we have from (2.50)

$f'(n) = 2^{0.4}\,10^{-0.9}\{(1-0.9)\ln 10 - 1\} = 2^{0.4}\,10^{-0.9}\{-0.76\} = -0.12624 < 0$

Through this example we realise the fact that in any practical situation this measure may not always be a monotonically increasing function of n for $p_i = q_i = \frac{1}{n}$, i = 1,...,n.

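Property IV states that the inaccuracy of a statement is unchanged if two alternatives about which the same assertion is made are combined. Let us consider
$$I_{\alpha,\beta}(p_1,p_2,p_3 ; q_1,q_2,q_2) = -2^{\beta}\{p_1^{\alpha} q_1^{\beta}\ln q_1 + p_2^{\alpha} q_2^{\beta}\ln q_2 + p_3^{\alpha} q_2^{\beta}\ln q_2\}$$
$$= -2^{\beta}\{p_1^{\alpha} q_1^{\beta}\ln q_1 + (p_2^{\alpha} + p_3^{\alpha})\, q_2^{\beta}\ln q_2\}$$
$$\ne I_{\alpha,\beta}(p_1, p_2+p_3 ; q_1, q_2) \quad\text{unless } \alpha = 1 .$$
So property IV is not satisfied for $I_{\alpha,\beta}(P:Q)$ unless $\alpha = 1$.

Note that for $\alpha = 1$ and $\beta = 0$ this measure happens to be Kerridge's measure of inaccuracy, and Kerridge's measure satisfies property IV.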
We shall now consider property V which states that I(P:Q) should be zero iff $p_i = q_i = 1$ for some i and $p_i = q_i = 0$ for all other i. Obviously $I_{\alpha,\beta}(D_i:D_i) = 0$ with the convention that 0 ln 0 = 0, where $D_i$ is the degenerate distribution with i-th component being unity. We can also see that $I_{\alpha,\beta}(P:Q)$ approaches infinity if for some i, $q_i = 0$ and $p_i \ne 0$. Now we shall discuss the minimum of this measure. Because $I_{\alpha,\beta}(P:Q)$ is always positive and it is zero iff $P = Q = D_i$ for some i = 1,...,n, we derive that the minimum of this measure is zero. This measure is not additive, but $I_{\alpha,\beta}(PQ:RS)$ can be expressed as a weighted sum of $I_{\alpha,\beta}(P:R)$ and $I_{\alpha,\beta}(Q:S)$:
$$I_{\alpha,\beta}(PQ:RS) = \Big(\sum_{j=1}^{m} q_j^{\alpha} s_j^{\beta}\Big) I_{\alpha,\beta}(P:R) + \Big(\sum_{i=1}^{n} p_i^{\alpha} r_i^{\beta}\Big) I_{\alpha,\beta}(Q:S).$$
This measure is subadditive according to its authors. It is not convex w.r. to $q_i$'s. See Kapur [24].

On considering this measure with an overall view, we conclude that its positive aspects are very limited. It is always positive and is zero only for degenerate distributions with matching non-zero components. Its desired high point might have been the subadditivity, but the property itself is not highly desirable for an inaccuracy measure. $I_{\alpha,\beta}(P:Q)$ does not satisfy any of the important properties like convexity, monotonic increase of $I_{\alpha,\beta}(U:U)$, etc.

(b) Sharma and Gupta's Power and Sine Measures of Inaccuracy :

•p . g 

One can easily see that I^ q(PsQ) and I„ « ^(PtQ) are 

oontinuous functions of only 

always positive, it can easily be verified 
take negative values also. Consider the following 

Example 2.3.5.2 : Let $\alpha$ = …, $\beta$ = …, $\gamma$ = … and P = (0.1, 0.9) and Q = (0.6, 0.4). Then we have

$I^{P}_{\alpha,\beta,\gamma}(P:Q) = -0.80006 < 0$.

The amount of inaccuracy must always be positive or, at the least, zero. It is difficult to give any kind of interpretation to inaccuracy being negative. Sharma and Gupta's power measure violates this most basic property.

We shall now consider $I^{P}_{\alpha,\beta,\gamma}(U:U)$ and $I^{S}_{\alpha,\beta,\gamma}(U:U)$ as functions of n, f(n) and g(n), and derive conditions for $f'(n) > 0$ and $g'(n) > 0$.

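Proposition 2.3.5.2 : $f(n) = I^{P}_{\alpha,\beta,\gamma}(U:U)$ is a monotonically increasing function of n if and only if

(i) $\beta < \gamma$ and either
(a) $(\alpha+\beta) < 1$, $n > \Big(\frac{1-\alpha-\gamma}{1-\alpha-\beta}\Big)^{\frac{1}{\gamma-\beta}}$, or
(b) $(\alpha+\beta) > 1$, $n < \Big(\frac{1-\alpha-\gamma}{1-\alpha-\beta}\Big)^{\frac{1}{\gamma-\beta}}$;

(ii) $\beta > \gamma$ and either
(c) $(\alpha+\beta) < 1$, $n > \Big(\frac{1-\alpha-\gamma}{1-\alpha-\beta}\Big)^{\frac{1}{\gamma-\beta}}$, or
(d) $(\alpha+\beta) > 1$, $n < \Big(\frac{1-\alpha-\gamma}{1-\alpha-\beta}\Big)^{\frac{1}{\gamma-\beta}}$.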

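Proof : From (2.48) we have
$$f(n) = I^{P}_{\alpha,\beta,\gamma}(U:U) = \frac{n^{1-\alpha}\,(n^{-\beta} - n^{-\gamma})}{2^{-\beta} - 2^{-\gamma}} .$$
Then we have
$$f'(n) = \frac{n^{-\alpha}}{2^{-\beta} - 2^{-\gamma}}\{(1-\alpha-\beta)\,n^{-\beta} - (1-\alpha-\gamma)\,n^{-\gamma}\} \tag{2.51}$$
In the RHS of the above expression, $n^{-\alpha}$ is always positive. Therefore we consider the following two cases :

Case (i) : $\beta < \gamma \Rightarrow 2^{-\beta} > 2^{-\gamma}$ or $(2^{-\beta} - 2^{-\gamma}) > 0$. Therefore for $f'(n)$ to be positive, we must have

$(1-\alpha-\beta)\,n^{-\beta} - (1-\alpha-\gamma)\,n^{-\gamma} > 0$
or $(1-\alpha-\beta)\,n^{-\beta} > (1-\alpha-\gamma)\,n^{-\gamma}$
or
$$(1-\alpha-\beta)\,n^{\gamma-\beta} > (1-\alpha-\gamma) \tag{2.52}$$
There are further two cases.

Case (a) : $1 > \alpha+\beta \Rightarrow (1-\alpha-\beta) > 0$. Then (2.52) becomes $n^{\gamma-\beta} > \frac{1-\alpha-\gamma}{1-\alpha-\beta}$, or, for $f'(n) > 0$,
$$n > \Big(\frac{1-\alpha-\gamma}{1-\alpha-\beta}\Big)^{\frac{1}{\gamma-\beta}} \quad\text{because } (\gamma-\beta) > 0 \tag{2.53}$$

Case (b) : $1 < \alpha+\beta \Rightarrow (1-\alpha-\beta) < 0$. Then (2.52) becomes $n^{\gamma-\beta} < \frac{1-\alpha-\gamma}{1-\alpha-\beta}$, or, for $f'(n) > 0$,
$$n < \Big(\frac{1-\alpha-\gamma}{1-\alpha-\beta}\Big)^{\frac{1}{\gamma-\beta}} \quad\text{because } 0 < \gamma-\beta \tag{2.54}$$

Case (ii) : $\beta > \gamma \Rightarrow 2^{-\beta} < 2^{-\gamma}$ or $(2^{-\beta} - 2^{-\gamma}) < 0$. Therefore for $f'(n)$ to be positive, from (2.51) we must have

$(1-\alpha-\beta)\,n^{-\beta} - (1-\alpha-\gamma)\,n^{-\gamma} < 0$
or $(1-\alpha-\beta)\,n^{\gamma-\beta} < (1-\alpha-\gamma)$.

There are again two cases further.

Case (c) : $(1-\alpha-\beta) > 0$. In this case for $f'(n) > 0$ we must have $n^{\gamma-\beta} < \frac{1-\alpha-\gamma}{1-\alpha-\beta}$, or
$$n > \Big(\frac{1-\alpha-\gamma}{1-\alpha-\beta}\Big)^{\frac{1}{\gamma-\beta}} \quad\text{because } (\beta-\gamma) > 0 \tag{2.55}$$

Case (d) : $(1-\alpha-\beta) < 0$. In this case for $f'(n) > 0$ we must have $n^{\gamma-\beta} > \frac{1-\alpha-\gamma}{1-\alpha-\beta}$, or
$$n < \Big(\frac{1-\alpha-\gamma}{1-\alpha-\beta}\Big)^{\frac{1}{\gamma-\beta}} \quad\text{because } (\gamma-\beta) < 0 \tag{2.56}$$

Therefore we have conditions for $f'(n)$ to be positive in (2.53), (2.54), (2.55) and (2.56).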

We can illustrate our conditions in the following examples.

Example 2.3.5.3 : Let $\alpha = 0.5$, $\beta = 1$, $\gamma = 2$. Then $\beta < \gamma$ and $(1-\alpha-\beta) < 0$. Therefore case (b) applies here. We must have for $f'(n) > 0$

$n < \Big(\frac{1-\alpha-\gamma}{1-\alpha-\beta}\Big)^{\frac{1}{\gamma-\beta}} = \Big(\frac{-1.5}{-0.5}\Big)^{1} = 3$.

Now let us consider n = 2. Then we have

$f'(2) = 2.828428\,\{-0.5 \times 0.5 + 1.5 \times 0.25\} = 0.353553 > 0$.

Now let us consider n = 4. Then we have

$f'(4) = 2\,\{-0.5 \times 0.25 + 1.5 \times 0.0625\} = -0.0625 < 0$.

Therefore for the above set of values of $\alpha$, $\beta$ and $\gamma$, n can be equal to at most 2.

Example 2.3.5.4 : Let $\alpha = 0.4$, $\beta = 0.5$, $\gamma = 0.3$. This is case (c) of Proposition 2.3.5.2. For $f'(n) > 0$ we must have

$n > \Big(\frac{1-\alpha-\gamma}{1-\alpha-\beta}\Big)^{\frac{1}{\gamma-\beta}} = \Big(\frac{0.3}{0.1}\Big)^{-5} = 0.0041$. That is, in this case n can be any positive integer.

Let n = 4; then we have

$f'(4) = -5.46 \times \{0.1 \times 0.5 - 0.3 \times 0.659754\} = 0.80767 > 0$.
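The numbers in Examples 2.3.5.3 and 2.3.5.4 can be reproduced with a short script. This is a sketch under the assumption that $f(n) = n^{1-\alpha}(n^{-\beta} - n^{-\gamma})/(2^{-\beta} - 2^{-\gamma})$, as used in the proof of Proposition 2.3.5.2; the function names are illustrative.

```python
def uniform_power_measure(n, alpha, beta, gamma):
    """f(n): the power measure evaluated at P = Q = uniform on n points."""
    return n ** (1 - alpha) * (n ** -beta - n ** -gamma) / (2 ** -beta - 2 ** -gamma)

def slope(n, alpha, beta, gamma):
    """f'(n) = n**-alpha / (2**-beta - 2**-gamma) * {(1-a-b) n**-b - (1-a-g) n**-g}."""
    bracket = (1 - alpha - beta) * n ** -beta - (1 - alpha - gamma) * n ** -gamma
    return n ** -alpha / (2 ** -beta - 2 ** -gamma) * bracket

def threshold(alpha, beta, gamma):
    """The bound ((1-a-g)/(1-a-b)) ** (1/(g-b)) appearing in cases (a) to (d)."""
    return ((1 - alpha - gamma) / (1 - alpha - beta)) ** (1 / (gamma - beta))

def numeric_slope(n, alpha, beta, gamma, h=1e-6):
    """Central finite difference of f, used only as a cross-check of slope()."""
    return (uniform_power_measure(n + h, alpha, beta, gamma)
            - uniform_power_measure(n - h, alpha, beta, gamma)) / (2 * h)

# Example 2.3.5.3: alpha = 0.5, beta = 1, gamma = 2 (case (b))
print(threshold(0.5, 1.0, 2.0), slope(2.0, 0.5, 1.0, 2.0), slope(4.0, 0.5, 1.0, 2.0))

# Example 2.3.5.4: alpha = 0.4, beta = 0.5, gamma = 0.3 (case (c))
print(threshold(0.4, 0.5, 0.3), slope(4.0, 0.4, 0.5, 0.3))
```

The sign of `slope` changes exactly at `threshold` in the first example, which is why n = 2 gives a positive slope and n = 4 a negative one.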

Proposition 2.3.5.3 : $g(n) = I^{S}_{\alpha,\beta,\gamma}(U:U)$ is a monotonically increasing function of n if and only if $\sin\gamma$ and $\{\gamma\cos(\gamma\ln n) + (1-\alpha-\beta)\sin(\gamma\ln n)\}$ are both of the same sign.

Proof : $g(n) = \frac{1}{\sin\gamma}\{n^{1-\alpha-\beta}\sin(\gamma\ln n)\}$, and on differentiating w.r. to n, we get
$$g'(n) = \frac{n^{-\alpha-\beta}}{\sin\gamma}\{\gamma\cos(\gamma\ln n) + (1-\alpha-\beta)\sin(\gamma\ln n)\} \tag{2.57}$$
From (2.57) it is evident that the condition is necessary and sufficient for $g'(n)$ to be greater than zero.

Example : Let $\gamma = \frac{\pi}{2}$, $\alpha = 0.5$, $\beta = 0$.

Then we have $\sin\gamma = \sin\frac{\pi}{2} = 1$, and now with n = 3 we have

$\{\gamma\cos(\gamma\ln n) + (1-\alpha-\beta)\sin(\gamma\ln n)\} = \{\frac{\pi}{2} \times (-0.15428) + 0.5 \times 0.98803\} = 0.25167 > 0$

thereby giving us $g'(n) > 0$.

Now with the same set of values of $\alpha$, $\beta$ and $\gamma$ we take n = 6 to find that $\{\gamma\cos(\gamma\ln n) + (1-\alpha-\beta)\sin(\gamma\ln n)\} = -1.3270441 < 0$. Here $\sin\gamma$ and $\{\gamma\cos(\gamma\ln n) + (1-\alpha-\beta)\sin(\gamma\ln n)\}$ have different signs. Hence $g'(n) < 0$ and g(n) is not increasing at n = 6.

We can easily see from the periodic nature of $I^{S}_{\alpha,\beta,\gamma}(P:Q)$ that for any set of values of $\alpha$, $\beta$ and $\gamma$, we will have several values of n for which $g'(n) > 0$ and several others for which $g'(n) < 0$. Therefore it is not possible to obtain any condition on n so that $g'(n) > 0$ is satisfied.
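The sign behaviour of $g'(n)$ can be explored numerically. The sketch below assumes $g(n) = n^{1-\alpha-\beta}\sin(\gamma\ln n)/\sin\gamma$ as in the proof of Proposition 2.3.5.3 and uses the parameter values of the example ($\gamma = \pi/2$, $\alpha = 0.5$, $\beta = 0$); the function names are illustrative.

```python
import math

def g(n, alpha, beta, gamma):
    """Sine measure at P = Q = uniform on n points (assumed form)."""
    return n ** (1 - alpha - beta) * math.sin(gamma * math.log(n)) / math.sin(gamma)

def bracket(n, alpha, beta, gamma):
    """gamma*cos(gamma ln n) + (1 - alpha - beta)*sin(gamma ln n)."""
    x = gamma * math.log(n)
    return gamma * math.cos(x) + (1 - alpha - beta) * math.sin(x)

def g_slope(n, alpha, beta, gamma):
    """g'(n) = n**(-alpha-beta) / sin(gamma) * bracket."""
    return n ** (-alpha - beta) / math.sin(gamma) * bracket(n, alpha, beta, gamma)

alpha, beta, gamma = 0.5, 0.0, math.pi / 2

for n in (3, 6):
    print(n, bracket(n, alpha, beta, gamma), g_slope(n, alpha, beta, gamma))

# Signs of g'(n) over a range of n: both signs occur, as the text observes.
signs = [g_slope(n, alpha, beta, gamma) > 0 for n in range(2, 200)]
print(any(signs), not all(signs))
```

Because $\gamma\ln n$ keeps sweeping through full periods of the sine and cosine as n grows, the bracket changes sign repeatedly, which is the obstruction to a single threshold condition on n.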


We shall now discuss other properties of these two measures of inaccuracy.

Neither $I^{P}_{\alpha,\beta,\gamma}(P:Q)$ nor $I^{S}_{\alpha,\beta,\gamma}(P:Q)$ satisfies the recursivity property.

$I^{P}_{\alpha,\beta,\gamma}(P:Q)$ and $I^{S}_{\alpha,\beta,\gamma}(P:Q)$ violate property IV. That is, the inaccuracy (measured by $I^{P}_{\alpha,\beta,\gamma}(P:Q)$ or $I^{S}_{\alpha,\beta,\gamma}(P:Q)$) of a statement changes when two outcomes regarding which the same assertion is made are combined. This highly desirable property is not satisfied by any of Sharma and Gupta's three measures.

They are not additive and are not convex. We cannot find the minimum value of any of these measures for fixed $\{p_i\}$ as they are not convex and hence Lagrange's method is inconclusive.


-P 


But for XZ n .,(P:Q) the minimum is zero which is attained by 

$P = Q = D_i$, where $D_i$ is the degenerate distribution with i-th component being unity. It is zero only for the degenerate distributions. $I^{P}_{\alpha,\beta,\gamma}(P:Q)$ takes negative values also, as is seen in Example 2.3.5.2, and so its minimum value cannot be obtained.


2*3.6 Kapur's Measures of Inaccuracy : 

Kapur proposed a very generalized measure of inaccuracy [23] which contains many known measures of inaccuracy as special cases. It is defined as follows :

Ij^(P:Q) = { S pj)®( S S 

i=l ^ 1=1 ^ 1=1 

1=1 ^ ^ 

-( S + ( S (2,58) 

1=1 ^ 1=1 ^ 

This measure is built to satisfy the following properties :

(i) $I(P:Q) \ge 0$

(ii) $I(P:Q) \ge I(P:P) = H(P)$

(iii) $I(P:Q) = H(P)$ if and only if Q = P.


But the special cases of this measure are of importance. We now present a list of special cases of (2.58).

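(i) If a = b = c = 0, k = 1 and $\alpha \to 1$, we get Kerridge's measure of inaccuracy, $-\sum_{i=1}^{n} p_i \ln q_i$.

(ii) If a = b = c = 0, k = 1, we get another measure of inaccuracy due to Kapur [23] which is given by
$$I^{\alpha}_{K2}(P:Q) = \frac{1}{\alpha-1}\Big(\sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha} - 1\Big) + \frac{1}{1-\alpha}\Big(\sum_{i=1}^{n} p_i^{\alpha} - 1\Big) \tag{2.59}$$

(iii) If a = b = c = 0, k = 0, we get the measure which was obtained independently by Kapur and J.A. Van der Lubbe [60],
$$I^{\alpha}_{L1}(P:Q) = \frac{1}{\alpha-1}\ln\Big\{\frac{\sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha}}{\sum_{i=1}^{n} p_i^{\alpha}}\Big\} \tag{2.60}$$

(iv) If a = b = c = 0, k = -1, we get the following new measure of inaccuracy
$$I^{\alpha}_{KN}(P:Q) = \frac{1}{\alpha-1}\Big\{1 - \frac{1}{\sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha}}\Big\} + \frac{1}{1-\alpha}\Big\{1 - \frac{1}{\sum_{i=1}^{n} p_i^{\alpha}}\Big\} \tag{2.61}$$

(v) If a = b = 0, c = 0, k = -1, we get Lubbe's [60] measure of inaccuracy
$$I^{\alpha}_{L2}(P:Q) = \frac{1}{\alpha-1}\Big\{1 - \frac{\sum_{i=1}^{n} p_i^{\alpha}}{\sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha}}\Big\} \tag{2.62}$$

Now we shall first undertake to study the properties of the measures defined in (2.59) - (2.62). Later we shall consider the general measure (2.58).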


2.3.6.1 Kapur's Measure of Inaccuracy :


a-1 n 


It is obviously a continuous function of both $p_i$'s and $q_i$'s. Now let us consider

l“jCU=U) = Uj- (nl-“.n“-Ul) + ^ 

= H^(0> - + = tl-n““^)=£(n) (say) 


(2.63) 


a-2 


We then have f'(n) = n which is always greater than or equal 


to zero. Therefore we conclude that

Proposition 2.3.6.1 : When n equally likely outcomes are stated to be equally likely, then $I^{\alpha}_{K2}(U:U)$ is a monotonically increasing function of n.

We could not find any recursive relation for $I^{\alpha}_{K2}(P:Q)$. Actually one can observe that the first part of this measure (2.59) is Havrda-Charvat's directed divergence $D^{\alpha}(P:Q)$ and the second part is (…) times Havrda-Charvat's information measure $H^{\alpha}(P)$. As we had seen in previous sections, no measure which has an exponent of the true probabilities in its expression, viz. $(p_i^{\alpha})$, satisfies property IV, which combines two outcomes when they are asserted to have the same probabilities. Hence $I^{\alpha}_{K2}(P:Q)$ also does not satisfy Property IV.

$I^{\alpha}_{K2}(P:Q) = 0$ iff $P = Q = D_i$ for some i. It is very easy to see this fact. $I^{\alpha}_{K2}(P:Q)$ does not approach infinity if a $q_i = 0$ and the corresponding $p_i \ne 0$. Now we shall consider the minimum value of $I^{\alpha}_{K2}(P:Q)$ for fixed $\{p_i\}$. Since it can be expressed as a sum of an information measure and a directed divergence measure, $I^{\alpha}_{K2}(P:Q)$ satisfies both properties VI and VII. Thus for a fixed $\{p_i\}$, $q_i = p_i$ gives a minimum of $I^{\alpha}_{K2}(P:Q)$, which is given by
$$\min_Q\{I^{\alpha}_{K2}(P:Q)\} = \frac{1}{1-\alpha}\Big(\sum_{i=1}^{n} p_i^{\alpha} - 1\Big), \text{ which tends to } -\sum_{i=1}^{n} p_i \ln p_i \text{ as } \alpha \to 1 .$$

If the variations of both P and Q are considered, $p_i = q_i = \frac{1}{n}$ ∀ i = 1,...,n is a minimax point for $I^{\alpha}_{K2}(P:Q)$.

Because both of Havrda-Charvat's measures (i.e., of information and directed divergence) are non-additive, we can easily deduce that $I^{\alpha}_{K2}(P:Q)$ is non-additive. Similarly $I^{\alpha}_{K2}(P:Q)$ is not subadditive.

$I^{\alpha}_{K2}(P:Q)$ is convex w.r. to $q_i$'s for $\alpha < 2$ because the sum of two convex functions is convex and Havrda-Charvat's measures are convex for $\alpha < 2$.


2.3.6.2 Lubbe's Measure of Inaccuracy :


It is defined as
$$I^{\alpha}_{L1}(P:Q) = \frac{1}{\alpha-1}\ln\Big\{\frac{\sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha}}{\sum_{i=1}^{n} p_i^{\alpha}}\Big\} = \frac{1}{\alpha-1}\ln\Big\{\sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha}\Big\} + \frac{1}{1-\alpha}\ln\Big\{\sum_{i=1}^{n} p_i^{\alpha}\Big\} \tag{2.64}$$

We observe from the above expression that Lubbe's measure of inaccuracy is a sum of Renyi's measure of directed divergence $D_{\alpha}(P:Q)$ and Renyi's measure of information $H_{\alpha}(P)$.

We can easily see that $I^{\alpha}_{L1}(P:Q)$ is a continuous function of both $p_i$'s and $q_i$'s. Now let P = Q = U in (2.64). Then we have
$$I^{\alpha}_{L1}(U:U) = \frac{1}{\alpha-1}\ln\{n^{1-\alpha}\, n^{\alpha-1}\} + \frac{1}{1-\alpha}\ln\{n^{1-\alpha}\} = \ln n ,$$
which is an increasing function of n. In fact we could have directly stated that $I^{\alpha}_{L1}(U:U)$ is an increasing function of n from (2.64), as we know that $D_{\alpha}(P:Q) = 0$ if P = Q and $H_{\alpha}(U) = \ln n$. Property II is satisfied by $I^{\alpha}_{L1}(P:Q)$.


Properties III and IV are not satisfied by $I^{\alpha}_{L1}(P:Q)$, which can be seen easily.

$I^{\alpha}_{L1}(D_i:D_i) = 0$ ∀ i = 1,...,n, but $I^{\alpha}_{L1}(P:Q)$ does not approach infinity if $q_i = 0$ and the corresponding $p_i \ne 0$, unlike Kerridge's measure. It has a finite value. Now we shall find the minimum value of $I^{\alpha}_{L1}(P:Q)$.

Proposition 2.3.6.2 : For a fixed $\{p_i\}$, $I^{\alpha}_{L1}(P:Q)$ is minimum when $q_i = p_i$ ∀ i = 1,...,n. This value of the inaccuracy is the amount of uncertainty involved in the probability distribution $\{p_i\}_{i=1}^{n}$.

Proof : Using the fact that $I^{\alpha}_{L1}(P:Q) = D_{\alpha}(P:Q) + H_{\alpha}(P)$ from (2.64) and that $D_{\alpha}(P:Q)$ is a convex function of $q_i$'s, we apply Lagrange's method to obtain the $I^{\alpha}_{L1}(P:Q)$ minimizing distribution $Q = \{q_i\}$ as

$q_i = p_i$ ∀ i = 1,...,n

and $\min_Q\{I^{\alpha}_{L1}(P:Q)\} = I^{\alpha}_{L1}(P:P) = H_{\alpha}(P)$.

If we consider the variations of both $\{p_i\}$ and $\{q_i\}$, the minimum value of $I^{\alpha}_{L1}(P:Q)$ is given by $P = Q = D_i$ for any i = 1,...,n. $p_i = q_i = \frac{1}{n}$ ∀ i = 1,...,n gives a minimax point for $I^{\alpha}_{L1}(P:Q)$, as for this the first term of (2.64) is zero and the second term attains its maximum value, namely ln n.

Thus we have seen that $I^{\alpha}_{L1}(P:Q)$ satisfies Properties VII and VIII. Now we shall show that $I^{\alpha}_{L1}(P:Q)$ is additive. It is because from (2.64)

$I^{\alpha}_{L1}(P:Q) = D_{\alpha}(P:Q) + H_{\alpha}(P)$

and both $D_{\alpha}(P:Q)$ and $H_{\alpha}(P)$ are additive functions of their arguments. However $I^{\alpha}_{L1}(P:Q)$ is not subadditive.

$I^{\alpha}_{L1}(P:Q)$ is a convex function of Q. This fact can easily be verified if (2.64) and the convexity of $D_{\alpha}(P:Q)$ are considered, for $0 < \alpha < 2$. For $\alpha > 2$ it is concave.

We conclude this section by observing that $I^{\alpha}_{L1}(P:Q)$ violates only the recursivity property and the property that the inaccuracy of a statement should be infinity if an outcome is asserted to have zero probability whereas in reality it has a positive probability, and that $I^{\alpha}_{L1}(P:Q)$ is a convex function of $q_i$'s for only $0 < \alpha < 2$. Otherwise it satisfies all the other properties satisfied by Kerridge's measure of inaccuracy and hence can be termed as a measure which is as good as Renyi's measure of inaccuracy.
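The main properties claimed for Lubbe's measure in this subsection can be checked numerically. The sketch below assumes the form $I^{\alpha}_{L1}(P:Q) = \frac{1}{\alpha-1}\ln\{\sum p_i^{\alpha} q_i^{1-\alpha}/\sum p_i^{\alpha}\}$; the function names and the test distributions are illustrative choices.

```python
import math
import random

def renyi_divergence(p, q, alpha):
    """D_alpha(P:Q) = ln(sum_i p_i**alpha * q_i**(1-alpha)) / (alpha - 1)."""
    s = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log(s) / (alpha - 1)

def renyi_entropy(p, alpha):
    """H_alpha(P) = ln(sum_i p_i**alpha) / (1 - alpha)."""
    return math.log(sum(pi ** alpha for pi in p if pi > 0)) / (1 - alpha)

def lubbe_inaccuracy(p, q, alpha):
    """Assumed form: ln(sum p^a q^(1-a) / sum p^a) / (alpha - 1)."""
    num = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q) if pi > 0)
    den = sum(pi ** alpha for pi in p if pi > 0)
    return math.log(num / den) / (alpha - 1)

def product(p, q):
    """Product (independent) distribution, flattened to a list."""
    return [pi * qj for pi in p for qj in q]

def random_distribution(n, rng):
    """A strictly positive random point of the probability simplex."""
    x = [rng.random() + 1e-3 for _ in range(n)]
    t = sum(x)
    return [xi / t for xi in x]

rng = random.Random(2)
P, Q = random_distribution(3, rng), random_distribution(3, rng)
R, S = random_distribution(4, rng), random_distribution(4, rng)

for alpha in (0.5, 1.5):
    decomposition = renyi_divergence(P, Q, alpha) + renyi_entropy(P, alpha)
    uniform = [1.0 / 5] * 5
    print(alpha,
          lubbe_inaccuracy(P, Q, alpha) - decomposition,                 # sum of D and H
          lubbe_inaccuracy(uniform, uniform, alpha) - math.log(5),       # I(U:U) = ln n
          lubbe_inaccuracy(P, Q, alpha) - lubbe_inaccuracy(P, P, alpha), # never negative
          lubbe_inaccuracy(product(P, R), product(Q, S), alpha)
          - lubbe_inaccuracy(P, Q, alpha) - lubbe_inaccuracy(R, S, alpha))  # additivity
```

The first, second and fourth printed differences should vanish up to rounding, and the third should be non-negative, in line with the minimum being attained at Q = P.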


2.3.6.3 Kapur's New Measure of Inaccuracy


It is defined as follows :
$$I^{\alpha}_{KN}(P:Q) = \frac{1}{\alpha-1}\Big\{1 - \frac{1}{\sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha}}\Big\} + \frac{1}{1-\alpha}\Big\{1 - \frac{1}{\sum_{i=1}^{n} p_i^{\alpha}}\Big\} \tag{2.65}$$

Remark : We point out that this measure cannot accept the set of distributions $P = D_1$ and $Q = D_2$, where $D_1$ and $D_2$ are degenerate distributions with 1st and 2nd components being unity respectively, because the 1st term is not defined for them. Except for the degenerate distributions with non-matching non-zero components, $I^{\alpha}_{KN}(P:Q)$ is a continuous function of both $p_i$'s and $q_i$'s. We have the same drawback in Kerridge's and Rathie-Kannappan's measures of inaccuracy.


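Proposition 2.3.6.3 : $I^{\alpha}_{KN}(P:Q)$ is an increasing function of n if n equally likely outcomes are stated equally likely.

Proof : We have
$$I^{\alpha}_{KN}(U:U) = \frac{1}{\alpha-1}\Big\{1 - \frac{1}{n^{1-\alpha}\, n^{\alpha-1}}\Big\} + \frac{1}{1-\alpha}\Big\{1 - \frac{1}{n^{1-\alpha}}\Big\} = \frac{1}{1-\alpha}\{1 - n^{\alpha-1}\} = f(n) \text{ (say); then we have}$$
$$f'(n) = n^{\alpha-2} > 0 .$$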


We have seen that Property II is satisfied. But it is easy to see that $I^{\alpha}_{KN}(P:Q)$ does not satisfy Properties III and IV. We shall now find the minimum value of $I^{\alpha}_{KN}(P:Q)$ for fixed P.

From (2.65) we can see that for $p_j = q_j = 1$ and $p_i = q_i = 0$ for i ≠ j, both the terms in $I^{\alpha}_{KN}(P:Q)$ vanish and hence $I^{\alpha}_{KN}(P:Q) = 0$. Therefore $I^{\alpha}_{KN}(P:Q)$ satisfies Property V.

$I^{\alpha}_{KN}(P:Q)$ does not satisfy property VI. For consider the following

Example 2.3.6.1 : Let P = (…), Q = (…) and let $\alpha = \frac{1}{2}$. Here $q_2 = 0$ whereas $p_2 \ne 0$. Then we have $I^{1/2}_{KN}(P:Q) = 2.0535029$, which shows that property VI is not satisfied by $I^{\alpha}_{KN}(P:Q)$.

Now we shall obtain the minimum value of $I^{\alpha}_{KN}(P:Q)$. We have
$$I^{\alpha}_{KN}(P:Q) = \frac{1}{\alpha-1}\Big\{1 - \frac{1}{\sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha}}\Big\} + \frac{1}{1-\alpha}\Big\{1 - \frac{1}{\sum_{i=1}^{n} p_i^{\alpha}}\Big\}$$
In this expression both the terms are positive. We consider two cases.

Case (i) : Let $\alpha > 1$; then $(\alpha-1) > 0$ and by Renyi's inequality (Kapur [20]) we have
$$\sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha} \ge 1 \quad\text{or}\quad \frac{1}{\sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha}} \le 1$$
and hence
$$\frac{1}{\alpha-1}\Big\{1 - \frac{1}{\sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha}}\Big\} \ge 0 ,$$
and for $\alpha > 1$, $(1-\alpha) < 0$, $\sum_{i=1}^{n} p_i^{\alpha} < \sum_{i=1}^{n} p_i = 1 \Rightarrow \frac{1}{\sum_{i=1}^{n} p_i^{\alpha}} > 1$, and again
$$\frac{1}{1-\alpha}\Big\{1 - \frac{1}{\sum_{i=1}^{n} p_i^{\alpha}}\Big\} > 0 .$$

Case (ii) : $\alpha < 1$; the same inequalities may be obtained.

Since $I^{\alpha}_{KN}(P:Q)$ is a sum of two positive terms, $\min_Q I^{\alpha}_{KN}(P:Q)$ is also the sum of the minimums of the two terms. But we know by its positivity that the minimum of the 1st term for a fixed $\{p_i\}$ is zero, which is attained when $q_i = p_i$ ∀ i = 1,...,n. The second term does not involve Q and hence it remains as it is. Thus we conclude our results in

Proposition 2.3.6.4 : The minimum of $I^{\alpha}_{KN}(P:Q)$ for a fixed $\{p_i\}$ is attained when $q_i = p_i$ ∀ i = 1,...,n and is equal to
$$\frac{1}{1-\alpha}\Big\{1 - \frac{1}{\sum_{i=1}^{n} p_i^{\alpha}}\Big\} .$$

For $p_i = q_i = \frac{1}{n}$ we do not know if this constitutes a minimax point, as we do not know if $\frac{1}{1-\alpha}\big\{1 - \frac{1}{\sum_{i=1}^{n} p_i^{\alpha}}\big\}$ is maximum for $p_i = \frac{1}{n}$.

We are also unable to either prove or disprove that $I^{\alpha}_{KN}(P:Q)$ is a convex function of $q_i$'s. $I^{\alpha}_{KN}(P:Q)$ is neither additive nor subadditive. We can easily give examples proving our claim. With this we complete the discussion of $I^{\alpha}_{KN}(P:Q)$ and its properties.

2.3.6.4 Van der Lubbe's Second Measure of Inaccuracy :


It is defined as follows :
$$I^{\alpha}_{L2}(P:Q) = \frac{1}{\alpha-1}\Big\{1 - \frac{\sum_{i=1}^{n} p_i^{\alpha}}{\sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha}}\Big\} \tag{2.66}$$

We shall consider $I^{\alpha}_{L2}(U:U) = \frac{1}{\alpha-1}\{1 - n^{1-\alpha}\} = f(n)$ (say).

We then have $f'(n) = n^{-\alpha} > 0$. Therefore we have

Proposition 2.3.6.5 : $I^{\alpha}_{L2}(U:U)$ is a monotonically increasing function of n.

So the inaccuracy measure satisfies Properties I and II.

However it can be easily seen that it does not satisfy properties III and IV.

For $p_j = q_j = 1$ for some j and $p_i = q_i = 0$ for i ≠ j, $I^{\alpha}_{L2}(P:Q) = 0$.

Before going on to discuss any other properties, we would like to discuss the convexity property of $I^{\alpha}_{L2}(P:Q)$ with respect to $q_i$'s.

Proposition 2.3.6.6 : $I^{\alpha}_{L2}(P:Q)$ is convex with respect to $q_i$'s if and only if $0 < \alpha < 1$.

Proof : We have 




S P 


a 


a l*-a 

S Piqi 


} 


Now consider 


, 0 < < 1 


(2*67) 


A+B+Cx^ 


then we have 


C X 


-a 




and 


~a 


--C^(A+B+Cx}~^) ax”^**^ - 2x““ C^{l-^)x 

f"{ X* ) = ’ 

^ (A+BtCx^^) ^ 

Now if we have (A+B+Cx^“^) > 0 then we get for 0 < a < 1 

f"i^) < O* 

n 

That is f(x,-) is concave- Then,n S fCx^) is also concave, but 
^ i=i 



ill 


n 


j f(x^) = Ij ^2 •••/Qjj) if we take C = 


OC"^ 


i=l 

1-1 

E 

k=l 


n 


and A =s s ® = E ^k^k convex* That completes 

k=l k=i—l 


1-a 


the proof because a+b+Cx^^ is always positive* 


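Now we know that Lagrange's multipliers method gives the minimum of $I^{\alpha}_{L2}(P:Q)$; we shall apply it and find the minimum of $I^{\alpha}_{L2}(P:Q)$ for fixed $\{p_i\}$.

Proposition 2.3.6.7 : For fixed $\{p_i\}$, $I^{\alpha}_{L2}(P:Q)$ is minimum when $q_i = p_i$ ∀ i = 1,...,n, and the minimum value is obtained as
$$\min_Q I^{\alpha}_{L2}(P:Q) = \frac{1}{\alpha-1}\Big\{1 - \sum_{i=1}^{n} p_i^{\alpha}\Big\} \quad\text{for } 0 < \alpha < 1 .$$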


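Proof : Let
$$L \equiv \frac{1}{\alpha-1}\Big\{1 - \frac{\sum_{i=1}^{n} p_i^{\alpha}}{\sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha}}\Big\} - \lambda\Big\{\sum_{i=1}^{n} q_i - 1\Big\} .$$
On equating $\partial L/\partial q_i$ to zero and solving for $q_i$ we obtain
$$q_i = p_i\Big\{\frac{-\sum_{k=1}^{n} p_k^{\alpha}}{\lambda\big(\sum_{k=1}^{n} p_k^{\alpha} q_k^{1-\alpha}\big)^{2}}\Big\}^{1/\alpha} \tag{2.68}$$
where $\lambda$ is to be eliminated from the equation $\sum_{i=1}^{n} q_i = 1$. We get
$$\Big\{\frac{-\sum_{k=1}^{n} p_k^{\alpha}}{\lambda\big(\sum_{k=1}^{n} p_k^{\alpha} q_k^{1-\alpha}\big)^{2}}\Big\}^{1/\alpha} = 1 \tag{2.69}$$
Therefore we have from (2.68) and (2.69)

$q_i = p_i$ ∀ i = 1,...,n.

This distribution gives the minimum of $I^{\alpha}_{L2}(P:Q)$ for only $0 < \alpha < 1$ because $I^{\alpha}_{L2}(P:Q)$ is convex only for $0 < \alpha < 1$. We cannot conclude anything for $\alpha > 1$. By substituting $q_i = p_i$ ∀ i = 1,...,n in (2.66) we get the minimum value of $I^{\alpha}_{L2}(P:Q)$ as has been stated.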

The minimum of this inaccuracy measure turned out to be Havrda-Charvat's measure of information, albeit for $0 < \alpha < 1$. Therefore we know that, for $0 < \alpha < 1$, $p_i = q_i = \frac{1}{n}$ gives a minimax point for $I^{\alpha}_{L2}(P:Q)$, because from Proposition 2.3.6.7 we have $p_i = q_i$ giving us the minimum of $I^{\alpha}_{L2}(P:Q)$, and $p_i = \frac{1}{n}$ giving us the maximum of this minimum, $\frac{n^{1-\alpha}-1}{1-\alpha}$, which tends to ln n as $\alpha$ approaches 1. So this measure satisfies Properties V, VII and VIII, the last only for $0 < \alpha < 1$.
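The minimum and minimax statements for Van der Lubbe's second measure can be checked numerically. The sketch below assumes the form $I^{\alpha}_{L2}(P:Q) = \frac{1}{\alpha-1}\{1 - \sum p_i^{\alpha}/\sum p_i^{\alpha} q_i^{1-\alpha}\}$ and, for the entropy, the normalisation $\frac{1}{\alpha-1}\{1 - \sum p_i^{\alpha}\}$; the function names and the value $\alpha = 0.5$ are illustrative choices.

```python
import math
import random

def lubbe2_inaccuracy(p, q, alpha):
    """Assumed form: (1 - sum p^a / sum p^a q^(1-a)) / (alpha - 1)."""
    num = sum(pi ** alpha for pi in p if pi > 0)
    den = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return (1 - num / den) / (alpha - 1)

def hc_entropy(p, alpha):
    """Havrda-Charvat-type entropy with normaliser (alpha - 1)."""
    return (1 - sum(pi ** alpha for pi in p if pi > 0)) / (alpha - 1)

def random_distribution(n, rng):
    """A strictly positive random point of the probability simplex."""
    x = [rng.random() + 1e-3 for _ in range(n)]
    t = sum(x)
    return [xi / t for xi in x]

alpha = 0.5
n = 5
rng = random.Random(3)
P = random_distribution(n, rng)
uniform = [1.0 / n] * n

# Minimum over Q for fixed P is attained at Q = P and equals the entropy of P.
at_p = lubbe2_inaccuracy(P, P, alpha)
trials = [lubbe2_inaccuracy(P, random_distribution(n, rng), alpha) for _ in range(2000)]
print(at_p, hc_entropy(P, alpha), min(trials))

# The maximum of that minimum over P is attained at the uniform distribution,
# and tends to ln n as alpha approaches 1.
entropies = [hc_entropy(random_distribution(n, rng), alpha) for _ in range(2000)]
print(hc_entropy(uniform, alpha), max(entropies))
print(hc_entropy(uniform, 0.9999), math.log(n))
```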

$I^{\alpha}_{L2}(P:Q)$ does not satisfy property VI, as it has a finite value even if $p_i \ne 0$ for some i whereas $q_i = 0$. But if this is true for every i = 1,...,n, then $I^{\alpha}_{L2}(P:Q)$ tends to infinity. Therefore we say that $I^{\alpha}_{L2}(P:Q)$ satisfies Property VI in only a particular case, but not in general.

One can easily verify that this measure is neither additive nor subadditive.

We have already discussed the convexity of $I^{\alpha}_{L2}(P:Q)$.



In the following table the following notation is followed :


Yes - the property is satisfied 

No - the property is not satisfied 

Yes/No - the property is conditionally satisfied 

- Unknown* 

Measures of Inaccuracy and their Properties : 


L Properties 

12 3456 739 10 U 

Measures ' 


1 

Yes 

Yes 

Yes 

Yes 

Yes 

Yes 



Yes 

Yes 

Yes 

2 

Yes 

Yes 

No 

m 


No 


r 

No 

Yes 

No 

Yes 

3 

Yes 

Yes 

No 

Yes 

Yes 

No 




No 

Yes 

4 

IB 

1 

Yes/No 

No 

No 

Yes 

, 

Yes 



1 

No 

Yes 

No 

5 



'H 

No 

No 

Yes 






Yes 

No 

6 

Yes 

Yes/No 

No 

'No 


** 

No 




No 

7 

Vc 

^ry Generalized Measure. We consider only special cases. 

8 

Yes 

Yes 

No 

No 



Yes 

Yes 

No 

No 

BBSIHii 

9 

Yes 

Yes 

No 

No 

r 

Yes 

NO 

Yes 

Yes 
- 

Yes 



No i 


Yes /No 

^ * - 

10 

Yes 

/No 

Yea 

No 

No : 

B 

— 

No 

Yes 

m 

No 

No 


11 

Yes 

Y0S 

h — — ^ 

No 

No 

■ 

■ 


Yes 

Yes 

No 

No 

Yes/No 


Table 2.3.7 


With that we conclude Chapter 2.



































































Chapter 3 


Independence Inequality and Subadditivity for Measures of Directed Divergence

3.1 Introduction : A.B. El-Sayeed had introduced in 1977 [5] an inequality for measures of entropy which he called the Independence inequality.

Let P*Q denote a joint probability distribution with P and Q being the marginal prob. distns., and let H(.) denote any entropy measure. Then the inequality

$H(P*Q) \le H(PQ)$ (3.1)

where PQ is the product distribution of P and Q, is called the independence inequality, and the entropy measure H(.) is said to satisfy the Independence Inequality (henceforth denoted by 'the I.I.') for that particular set of prob. distns.

Here we note that the I.I. is very much a property of the distributions as well as that of measures of entropy.

El-Sayeed had obtained several types of probability distributions for which the I.I. is satisfied by … and … [for definitions of these measures of entropy, refer to Chap. 2]. He had also obtained a set of distributions for which none of the above stated measures of entropy satisfy the I.I.

The significance of the I.I. lies in that result of El-Sayeed which states that if a measure of entropy is additive and satisfies the I.I., then it is subadditive for that joint probability distribution. For the definitions of additivity and subadditivity, refer to Chap. 2.

In this chapter we generalize the I.I. to the measures of directed divergence and obtain various prob. distns. for which … and … satisfy the I.I. We also obtain a prob. distn. for which the I.I. is satisfied by none of these measures of directed divergence. We do all this in the following section.

3.2 The I.I. for Measures of Directed Divergence:

Let P*Q and R*S be any two joint prob. distns. with
P, Q and R, S being their marginal prob. distns. respectively.

Let $P*Q = (\pi_{jk})$, j,k = 1,...,n and let $R*S = (w_{jk})$, j,k = 1,...,n.

Let PQ and RS denote the product distributions of P and Q and of R
and S respectively. Now if D(.:.) is any directed divergence
measure, we define the following:

Subadditivity: D(P*Q:R*S) is said to be subadditive for P*Q
and R*S if

$D(P*Q : R*S) \le D(P:R) + D(Q:S)$    (3.2)

and if the inequality (3.2) is satisfied for all P*Q and R*S
then D is said to be subadditive.

Additivity: D(PQ:RS) is said to be additive for PQ and RS if

$D(PQ:RS) = D(P:R) + D(Q:S)$    (3.3)

and if the equality (3.3) holds for all independent prob.
distns. then D is said to be additive.

Independence Inequality: If D(P*Q:R*S) satisfies

$D(P*Q : R*S) \le D(PQ:RS)$    (3.4)

then D is said to satisfy the I.I. for P*Q and R*S. If the
inequality (3.4) is satisfied for all joint prob. distns. P*Q
and R*S then D is said to satisfy the I.I.

Only the Kullback-Leibler measure of directed divergence
satisfies all the three concepts defined above. The Renyi ($D_{\alpha}$),
Havrda-Charvat ($D^{\alpha}$), Kapur ($D_{\alpha,\beta}$) and Sharma and Taneja ($D^{\alpha,\beta}$)
measures satisfy the I.I. for certain distributions, and examples are
given in this chapter to show that they do not satisfy it for
others.

But to begin with we establish the connection between
the three concepts defined above.

Theorem 3.2.1: (i) If D is additive and satisfies the I.I.
for a set of distributions then it is subadditive for that set
of distributions.

(ii) If D is subadditive and additive then it
satisfies the I.I.

Proof: (i) From (3.3) we have $D(PQ:RS) = D(P:R) + D(Q:S)$,
and from (3.4)

$D(P*Q : R*S) \le D(PQ:RS)$.

Therefore we get $D(P*Q : R*S) \le D(P:R) + D(Q:S)$, which is what was
to be shown.

(ii) From (3.2) and (3.3) we have $D(P*Q : R*S) \le D(P:R) + D(Q:S)
= D(PQ:RS)$, which means that D satisfies the I.I. for P*Q and R*S.


We can verify our results in the following example.

Example 3.2.1: Let $D(P:Q) = D^{\alpha}(P:Q)$. Also let P be an
arbitrary distribution and let Q, R and S be uniform distributions.
P*Q is arbitrary but we choose R*S = RS. At a later stage
[Proposition 3.2.2] we shall show that $D^{\alpha}$ indeed satisfies the
I.I. for this set of distributions. Here we shall establish
that $D^{\alpha}$ is both additive and subadditive for this set of
distributions.

Consider the following relation (cf. Chap. 2) for
$D^{\alpha}(PQ:RS)$:

$D^{\alpha}(PQ:RS) = D^{\alpha}(P:R) + D^{\alpha}(Q:S) + (\alpha - 1)\, D^{\alpha}(P:R)\, D^{\alpha}(Q:S)$    (3.5)

Because Q = S, we have $D^{\alpha}(Q:S) = 0$.

Therefore $D^{\alpha}$ is additive for PQ and RS.

Now we shall show that $D^{\alpha}$ is subadditive for P*Q and
R*S. For, consider

D^^'CP-JfrQtR-H-s) = 


a , 1 %a-l 


2 -1 


n 


1 r 1 ct 3(l-<x) 


n 

i] 
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,i-a 


-1 




3(1 -a) 


E 

j 


k ^ 




a 


-ii 


[by convexity of $x \to x^{\alpha}$ for $\alpha > 1$
and $2^{1-\alpha} - 1 < 0$. If $0 < \alpha < 1$,
$x \to x^{\alpha}$ is concave and $2^{1-\alpha} - 1 > 0$.
So in both cases the inequality is the
same.]


1 

,i “CX 


'“1 


r 2“3 a, xtt-l 
[n S .) 

j j j 


- 1 ] 


< 




a-1 


r . 


3 


i] 


< 1 for a < 1 
> 1 for a > 1 


We have $D^{\alpha}(P*Q : R*S) \le D^{\alpha}(P:R) + D^{\alpha}(Q:S)$.

Therefore we get that $D^{\alpha}(P*Q : R*S)$ is subadditive. We can easily
construct examples such that a directed divergence measure
satisfies only the I.I. and does not satisfy the other two
properties.

Lemma 3.2.1: Let $\alpha \in R$ and $\alpha \neq 1$. Then, for a given set of
prob. distns., the I.I. is either satisfied by both $D^{\alpha}$ and $D_{\alpha}$
or is satisfied by neither.

P roof : D^(P:Q) = (E p'J q^.““ -l), = (a-1)”^ (3.6) 

and Dq,(P:Q) = log <3^”^ *■ 1^» (3.7) 

The I#I- for says s 


C E 


CX 1-a 
jk 


1 ) < 


( S - i) 
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( S >i?v W < ( S (Pj<3k)“ 


j/k 


jk jk 


j,k 


-j-k' 


(3.8) 


And for D^, the I.l# is 


jU. log 


w, cc i-a / / xa 

j^k "jk ^jk ^ 


(r.Sk)^-^) 


(3.9) 


Now it is clear that (3.8) and (3.9) are equivalent, because
$x \to \log x$ is an increasing function. That completes the proof
of Lemma 3.2.1.


So, in what follows it is enough to prove or disprove
the I.I. for either $D^{\alpha}$ or $D_{\alpha}$; the proof or disproof for
the other follows immediately.
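
The consequence of Lemma 3.2.1 can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the commonly quoted forms $D^{\alpha} = (\sum \pi^{\alpha} w^{1-\alpha} - 1)/(2^{\alpha-1} - 1)$ and $D_{\alpha} = (\alpha-1)^{-1} \log \sum \pi^{\alpha} w^{1-\alpha}$, which may differ in normalization from the definitions of Chap. 2. All function names are hypothetical. The point it illustrates is the one used in the proof: both measures are monotone functions, in the same direction, of one and the same sum, so the I.I. verdicts must agree.

```python
import math
import random


def power_sum(joint, ref, alpha):
    # Assumed common form: sum over cells of pi^alpha * w^(1 - alpha).
    return sum(p ** alpha * w ** (1.0 - alpha) for p, w in zip(joint, ref))


def d_havrda_charvat(joint, ref, alpha):
    # Assumed normalization (2^(alpha-1) - 1)^(-1); the thesis constant may differ.
    return (power_sum(joint, ref, alpha) - 1.0) / (2.0 ** (alpha - 1.0) - 1.0)


def d_renyi(joint, ref, alpha):
    return math.log(power_sum(joint, ref, alpha)) / (alpha - 1.0)


def random_joint(n, rng):
    # Strictly positive n x n joint distribution, stored row by row.
    cells = [rng.random() + 1e-3 for _ in range(n * n)]
    total = sum(cells)
    return [c / total for c in cells]


def product_of_marginals(joint, n):
    rows = [sum(joint[j * n:(j + 1) * n]) for j in range(n)]
    cols = [sum(joint[j * n + k] for j in range(n)) for k in range(n)]
    return [rows[j] * cols[k] for j in range(n) for k in range(n)]


def satisfies_ii(div, pq, rs, n, alpha):
    # Independence inequality (3.4): D(P*Q : R*S) <= D(PQ : RS).
    lhs = div(pq, rs, alpha)
    rhs = div(product_of_marginals(pq, n), product_of_marginals(rs, n), alpha)
    return lhs <= rhs


def verdicts_agree(trials=200, n=3, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        alpha = rng.choice([0.3, 0.7, 1.5, 2.0, 3.0])
        pq, rs = random_joint(n, rng), random_joint(n, rng)
        if satisfies_ii(d_havrda_charvat, pq, rs, n, alpha) != satisfies_ii(d_renyi, pq, rs, n, alpha):
            return False
    return True
```

Calling `verdicts_agree()` draws random joint distributions and reports whether the two measures ever disagree on the I.I.; under the assumed forms they should not, whatever the sign of the inequality in each trial.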


Proposition 3.2.1: There exist $\alpha$, $\beta$ and prob. distns. such that
the directed divergences $D^{\alpha}$, $D_{\alpha}$, $D_{\alpha,\beta}$ and $D^{\alpha,\beta}$ do not satisfy the I.I.

(For the sake of convenience, we give the definitions of these
measures again. Here we consider their normalized forms:


(i) d“(P:0) = 

(ii) D^(P;Q) = 

(iii) D^^^(P:Q) 


i r„ a a-l . 


1 , „ a a-l 

^1=0^ f '’J 


a 5 ^ 1 

a ^ 1 

CL ^ 0 f 


(3.10) 

(3.11) 


0 < a < 1,0 > 1 
or 0<P<l.a>l 


(3.12) 


(iv) D°^^^(P:Q) = ^ ? P j 

^ ^ (3.13)) 



Proof ! (1) Lot a = 2, P«Q = and R»s = O.J, 

Then we get P = (0.5, 0.5), Q = (0.45, 0.55), R = (0.3, 0.7) and
S = (0.5, 0.5).


= 'S:22l o:2?5> o:'!’ 


Now we have 


^ 2 Ti^v - 1) = 1.8785 and 


j/k 


jk Jk 


(■ S = 1.87375 which show^s that 

$D^{\alpha}$ (and hence, by Lemma 3.2.1, $D_{\alpha}$) does not satisfy the I.I.

(iii) Let $\alpha = 2$, $\beta = 0.5$ and let P*Q and R*S be as in (i).
We get


S n 

-L- loo 
^-a 


a+3-1 -l+a 




jk 


S TI^v 
j/k 


= 0.8094736 

il+a 


.2 

^ log 7 :~ ' P ^ " 0.30129. 


J.k 


which proves that $D_{\alpha,\beta}$ does not satisfy the I.I. for this
set of distributions.

(iv) We have for $\alpha = 2$, $\beta = 0.5$,


- . ^ - V jit = 2.9164887 and 

•atg jj^k 


1 r „ CG 

^ ^ £ E 71 


Then 


given 
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So we have been able to find a set of distributions for 
which none of the four directed divergence measures $D^{\alpha}$, $D_{\alpha}$,
$D_{\alpha,\beta}$ and $D^{\alpha,\beta}$ satisfies the I.I.

In the following Example, we construct a set of
distributions for which $D^{\alpha}$ and hence $D_{\alpha}$ satisfy the I.I.

E xample 3»2»2 : Let P Q and R S be defined by the conditional 

probability matrices : 


^ik 


. P P 

^“P / n=I 

P , P 

1-P /•••/ ^ 


* 

« 


P P 

’-n^T ' n=r 


and s^^ = 


•n n 


1 -£. 
1~P/ H^'***' 


P P 1 .. I 


n n 


( 3 * 14 ) 


where 0 


<p<l, p^_|¥i = l,..,,n, R = (r^, , 


= ( 1“^, 0, 0, • • * , 0,p ) • 

Then we have 


^jk “ ^j^jk n ‘^jk^ jk ^j®k 


._^2 p(i-p) p(l“p) 

(1-p) . 

OfOfO^ 0 

0 jp 0 0|r # # f ^ 0 

I 2 2 I 

L /•*•/ p^^~p^ Jp 


n 


Now ? "jk = n Ijk “ H J = 


and finally we have S = (s. #3,, •***J^' • 





After carrying out the calculations, we obtain

J /K 

+ (n~i)^^~“°^^ (3.15) 

and 

,r (pjq^)“(rjS„)“-i =n-2“ [(i-p)“-V^][(l-p)“‘^+(n-i)*-“p“-i] 

j /k 

(3.16) 

(l-p)^®“^ + (n-l)^^^““^p^'^"^ 

Let f(p) « — — — ■' — -” -- j"— - then we have (3*17) 

(l-p)""-^ + (n-i)^”^ p““^ 

f(0) = 1 and £(1) = 

[ (l“p)^ ^+(n-i )^’^p*^ ^ ] [-{2ct-l ) (l-p 

t (n-l)2<--«U2a.l)p2«~2]- [ (i-p)2<=^“^+{n-l )2 ] 

^ [ -(a-i ) (i-p)^'”^ + (n-i)^~^(a-l)p^**^ ] 

f ' (p) a ^ .a-1 ^ f 7'^“°^ a-1 

[(1-p) + (n-1; p J 

(3*19) 

And f'(p) 0 <=«> p = « (^) and £{p^) = (|)®. (3.20) 

From (3»18) and (3.20) we can sea that 

0 < (|)“ = £(Po) < (3.11) 

•** The point p = p^ is a minimum £or £(p) i*s*, we have 
(3*17) £(p) < = 1 

or C(l-p)2“-^ + < n-“ [(l-p)‘‘-^+(n-l)2-%‘=‘-i ]] 



Now > O, we get the I.I. satisfied for 

both $D^{\alpha}$ and $D_{\alpha}$. This proof holds for all values of $\alpha$, $\alpha \neq 1$.

In the next proposition, we shall consider the I.I.
for more general distributions than the ones considered in
Example 3.2.2.


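Proposition 3.2.2: For arbitrary P*Q with Q = U and R = S = U,
where U is the uniform distribution with n components and
R*S = RS, the following directed divergences satisfy the I.I.:

(i) $D^{\alpha}$, $\alpha > 0$, $\alpha \neq 1$

(ii) $D_{\alpha}$, $\alpha > 0$, $\alpha \neq 1$

(iii) $D_{\alpha,\beta}$, $\alpha+\beta-1 > 1$, $\beta < 1$ (or $\alpha+\beta-1 < 1$, $\beta > 1$)

(iv) $D^{\alpha,\beta}$, $\alpha > 1$, $\beta < 1$ or $\alpha < 1$, $\beta > 1$.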

Proof: (i) Let $0 < \alpha < 1$. Then we get $\mu = (2^{1-\alpha} - 1)^{-1} > 0$.

We also have, by concavity of $x \to x^{\alpha}$,


^4 V ' ® •? V'' — n -i-i 5-j 




n ^jk 


< n vS r 


n ^jk' 


r,2(l-a) ^ 


_ a , , ,a-l ^ 2Cl-a) 1-a 

or s ^ " “ 


^2 (1-a) E g^¥ j=l,2,,..,n 

(3*22 ) 
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Now multiplying both sides of (3 #22) by p^ and sianming over 
we get 


3*^ 


(3.23) 


From (3.23) and because $\mu > 0$ we get


S (pjqj^)“(rjSj^)“-^ < E Cpjq^)“(rjsp“-^ (3.24) 

J 3»^ 

(3.24) implies the I.I. for $D^{\alpha}$.

(b) $\alpha > 1$. Now $\mu < 0$ and $x \to x^{\alpha}$ is convex, so the inequality
in (3.23) is reversed; but $\mu < 0$, therefore we again obtain (3.24).

(ii) Using (i) and Lemma 3.2.1 we get the I.I. for $D_{\alpha}$.

*,t j (X+p —1 

(iii) Let 3 < 1, at^-l >1. Now jLt = O-a) ^ < O and x x 

is convex. Therefore, we get by proceeding in the same way as 

we did in (i) above, 

E (n )"P-hw )-l« > r (Pjqp“*^-hr (3.24) 

,k j,Jc 


J 

and 


2 

j,k J"" 


< 2 
j,k 


( 3 .2 5) 


Fran (3.24) and (3.25) 


,cx+^-l /.. ^-l+a 


2 (71,.^)'*^'^-" (w_) 

log JliJI — > log 




E (Hj^) 






2 

j , k 


0 


(3.26) 
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and because ju < 0, 




log { 


j f'k. 




jk' 


E (n ., ) 

J/k 


F 


-} < log s } 

(3#27) 


The last inequality is what we intended to prove.

Now if we consider $\beta > 1$ and $\alpha+\beta-1 < 1$, then $\mu > 0$ and $x \to x^{\alpha+\beta-1}$ is
concave. Therefore the inequality (3.26) is reversed and once
again we obtain (3.27).


(iv) D^'^(P;Q) 


1 r ^ - 

(-00+3) ^ j 


J 


Now consider $\alpha > 1$, $\beta < 1$. Then $\mu = (\beta-\alpha)^{-1} < 0$, $x \to x^{\alpha}$ is
convex and $x \to x^{\beta}$ is concave. Therefore we obtain the following
inequalities:


E (ti > E 

Uk ^ ^ ^ ^ 


a-1 




,a-l 


j 


Now from (3*28) and (3*29) we get 


( E (7I.t,)“(w,, E > 


^r., •nP-I 


j,k 


^jk' ^'™jk 


j,k 


jk" '"jk' 


(3-28) 

(3.29) 


/ / %C0/ \O0-l 

( E (p-i^ic^ 
j,k 


j,k 




(3*30) 


But fi < 0, thereby 
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j /k 


.)^(w. 


j A 


jk' '"jk 


)^-i) 



r J S3^) ^•■^- . S C p j q^^) ^ ( r ^ s,^) ^ ) 

J 


(3*31) 


(3.31) is the I.I. for $D^{\alpha,\beta}$. If we consider $\alpha < 1$, $\beta > 1$, then $\mu > 0$ and the
inequality in (3.30) is reversed, giving (3.31) again. That
completes the proof of Proposition 3.2.2.


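Remark: We can easily verify from the proof of Proposition 3.2.2
that the following directed divergences do not satisfy the I.I.
for the same set of distributions as in the proposition:

(i) $D^{\alpha}$, $\alpha < 0$

(ii) $D_{\alpha}$, $\alpha < 0$

(iii) $D_{\alpha,\beta}$, $\alpha+\beta-1 > 1$, $\beta > 1$ or $\alpha+\beta-1 < 1$, $\beta < 1$

(iv) $D^{\alpha,\beta}$, $\alpha > 1$, $\beta > 1$ or $\alpha < 1$, $\beta < 1$.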


Proposition 3.2.3: If the rows of the conditional probability
matrix $(q_{jk})$ are permutations of the same n numbers, and R*S, R
and S are as in Proposition 3.2.2, the following directed divergences
satisfy the I.I.:

(i) $D^{\alpha}$, $\alpha > 0$, $\alpha \neq 1$

(ii) $D_{\alpha}$, $\alpha > 0$, $\alpha \neq 1$

(iii) $D_{\alpha,\beta}$, $\alpha+\beta-1 > 1$, $\beta < 1$ or $\alpha+\beta-1 < 1$, $\beta > 1$

(iv) $D^{\alpha,\beta}$, $\alpha > 1$, $\beta < 1$ or $\alpha < 1$, $\beta > 1$.
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££22f ! Let such set of n numbers be ( 32 /■*■»», a^) and 


n 


a 


let E a. = A* Then 


i=l 


E = S a!^ n*^'* 


>a-i 


k 




a „2(l~a) „2(i-a) 


(3.32 ) 


E E_ Pj = E Pj A (3.33) 


j 


From (3.32) and (3*33) we qet 


cx , xtt-l a / ^ 

E q.v(r.s,, ) = S E P,q,v(r.s,, ) 


a-1 


^jk'^j jk 


k j 


j^jk'^j-'jk' 


(3.34) 


Now ifO<a<l,EE p .q*^, (r.s 

kj 


< S (E p.q., j^2(i-a) ^ 2 qj (3-35) 


k j 


j^jk 


From (3*34) and (3-35) we get 

S q?^(rjS „ )^*”^ < E) q^ / j=l,2^...,n. (3-36) 

k k 

Now multiplying with p^ and summing over j, we get from (3-36) 

,a-X 




(3.37) 


/ X*"(x - \ "“jL 

For 0<a<i,M=(2 -1) >0 


M( E^(pjqj3,)''(rj.Sj3^)°'“^-l) < M(.yPj.qT,)^rjS3^)“* -1) 

(3.33) 
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which is the I»I# for D (and hence for * We can prove the 
result in the same way for $\alpha > 1$. And by taking the values of
$\alpha$ and $\beta$ in the range prescribed, we can easily obtain the result
both for $D_{\alpha,\beta}$ and $D^{\alpha,\beta}$. That completes the proof of Proposition
3.2.3.


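Remark: The following directed divergences do not satisfy the
I.I. for distributions as in Proposition 3.2.3:

(i) $D^{\alpha}$, $\alpha < 0$

(ii) $D_{\alpha}$, $\alpha < 0$

(iii) $D_{\alpha,\beta}$, $\alpha+\beta-1 > 1$, $\beta > 1$ or $\alpha+\beta-1 < 1$, $\beta < 1$

(iv) $D^{\alpha,\beta}$, $\alpha > 1$, $\beta > 1$ or $\alpha < 1$, $\beta < 1$.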


Proposition 3.2.4: Let $\alpha > 0$ and $\alpha \neq 1$. If the elements of
the conditional probability matrix $(q_{jk})$ satisfy the equation
$\sum_k q_{jk}^{\alpha} = A$ for all j = 1,2,...,n, where A is some constant, and R*S,
R and S are as in Proposition 3.2.2, then $D^{\alpha}$ and $D_{\alpha}$ satisfy the I.I.

Proof: Note that in the proof of Proposition 3.2.3, we used the
row permutation property of $(q_{jk})$ only in obtaining the equation:


S = An^^^““\ a constant (3.39) 

J J 3 

Therefore the whole proof of (i) and (ii) of Proposition
3.2.3 remains valid for all distributions satisfying (3.39), not
necessarily having the row permutation property of $(q_{jk})$. That completes
the proof of Proposition 3.2.4.

Proposition 3.2.5: The I.I. is satisfied by the directed divergences
noted in Proposition 3.2.2 if P = U, where U is the uniform
distribution, and R*S, R and S are exactly as in Proposition 3.2.2.


P roof I L;= S 
J/k 




{X«*l 


= s (-^) 

J/k ^ ? 


1 ^a-l 


= „2(l-a) (Ua ^ ,^ 0 . 

“ j.k (3.40) 


r. „ / Ntt-l 2(i-a) /-lx a 

j,k 


a 

“ ^n' " I "^k 


= n2'i-“’ci)“ n jCl pjqj^) 


a 


k j 


= ^2(l-a) (ijct jtj ja 


n 


k j 


Jk' 


(3*41) 


Now if we take $0 < \alpha < 1$, we have by concavity of $x \to x^{\alpha}$,




(£ t q 

n ^jk 


1 *“0C /■ v* A". \ ^ 

n (S qj^) 


2 

J#k 


q. 


a 


jk 


< n 




E(S qj„)“. 


(3*42) 


From (3.40), (3.41) and (3.42) we get $L \le R$, and because
$\mu = (2^{1-\alpha} - 1)^{-1} > 0$ the I.I. is proved for $D^{\alpha}$ and $D_{\alpha}$.

For $\alpha > 1$, we get $L \ge R$ and $\mu < 0$, thereby proving the I.I. for
$D^{\alpha}$ and $D_{\alpha}$ again. For $D_{\alpha,\beta}$ and $D^{\alpha,\beta}$ the proof is similar to that
of Proposition 3.2.2.

Corollary 3.2.1: The I.I. is satisfied by the directed
divergences noted in Proposition 3.2.2 if the joint probability
distribution P*Q is uniform and R*S is as in Proposition 3.2.2.

Proof: If P*Q is uniform, then

$p_j = \sum_{k=1}^{n} \pi_{jk} = \sum_{k=1}^{n} \frac{1}{n^2} = \frac{1}{n}, \quad j = 1, \ldots, n.$

Therefore the condition of Proposition 3.2.5 is satisfied.

Note here that if P*Q is uniform and because R*S is also
assumed to be uniform, we have both L and R = 0, proving the
Corollary.

Corollary 3.2.2: If the conditional probability distribution
$(q_{jk})$ is uniform then also the I.I. is satisfied by the
divergences of Proposition 3.2.2 when R*S is uniform.

Proof: If the conditional probability distribution is uniform,
then it satisfies the row permutation property. We also have

$q_k = \frac{1}{n}, \quad k = 1, \ldots, n.$

Therefore this corollary can be considered as a corollary of
Proposition 3.2.2 as well.

Now we shall summarize the cases we have considered in
Propositions 3.2.2 - 3.2.5 in the following theorem.

Theorem 3.2.2: The I.I. is satisfied by the following directed
divergences

(i) $D^{\alpha}$, $\alpha > 0$, $\alpha \neq 1$

(ii) $D_{\alpha}$, $\alpha > 0$, $\alpha \neq 1$

(iii) $D_{\alpha,\beta}$, $\alpha+\beta-1 < 1$, $\beta > 1$ or $\alpha+\beta-1 > 1$, $\beta < 1$

(iv) $D^{\alpha,\beta}$, $\alpha > 1$, $\beta < 1$ or $\alpha < 1$, $\beta > 1$

if any of the following conditions is satisfied:

1. The probability distribution P is uniform and R*S is the
uniform distribution.

2. The probability distribution Q is uniform and R*S is the
uniform distribution.

3. The conditional probability matrix $(q_{jk})_{n \times n}$ has the row
permutation property and R*S is the uniform distribution.

4. If there exists a constant A such that $\sum_k q_{jk}^{\alpha} = A$ for all j = 1,...,n
and R*S is the uniform distribution, then only (i) and (ii)
satisfy the I.I.

Corollary 3.2.3: The additive directed divergences

(i) $D_{\alpha,\beta}$, $\alpha+\beta-1 < 1$, $\beta > 1$ or $\alpha+\beta-1 > 1$, $\beta < 1$
and

(ii) $D_{\alpha}$, $\alpha > 0$, $\alpha \neq 1$

are subadditive for any of the probability distributions in
Theorem 3.2.2.

Proof: It is a direct application of Theorem 3.2.1.
With that we conclude this chapter.



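Chapter 4


Subadditivity, Superadditivity and
Measures of Dependence

4.1 Introduction: Let X and Y be two random variables and let

$\Pr(X = x_i) = p_i, \quad i = 1, 2, \ldots, m$    (4.1)

$\Pr(Y = y_j) = q_j, \quad j = 1, 2, \ldots, n$    (4.2)

$\Pr(X = x_i, Y = y_j) = p_{ij}, \quad i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n$    (4.3)

$P = (p_1, p_2, \ldots, p_m), \quad Q = (q_1, q_2, \ldots, q_n)$    (4.4)

$P*Q = (p_{11}, p_{12}, \ldots, p_{1n}; \ldots; p_{m1}, \ldots, p_{mn}), \quad PQ = (p_1 q_1, \ldots, p_1 q_n; \ldots; p_m q_1, \ldots, p_m q_n)$    (4.5)

denote the corresponding probability distributions. Now

$\sum_{j=1}^{n} p_{ij} = p_i,\ i = 1, 2, \ldots, m; \qquad \sum_{i=1}^{m} p_{ij} = q_j,\ j = 1, 2, \ldots, n$    (4.6)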

so that for a given bivariate probability distribution P*Q, we
can find the marginal probability distributions P and Q.

Now let E(P), E(Q), E(P*Q) and E(PQ) denote the entropies
of the corresponding probability distributions according to any
measure of entropy E we may use.

(i) The measure E is said to be subadditive for P*Q if

$E(P*Q) \le E(P) + E(Q)$    (4.7)

i.e., if the information given by P*Q is less than or equal to
the sum of the information given by P and Q separately.

(ii) The measure E is said to be superadditive for P*Q if

$E(P*Q) \ge E(P) + E(Q)$    (4.8)

i.e., if the information given by P*Q is greater than or equal
to the sum of the information given by P and Q separately.

(iii) The measure E is said to be additive for PQ if

$E(PQ) = E(P) + E(Q)$    (4.9)

i.e., the information given by PQ is equal to the sum of the
information given by P and Q separately.

We make the following remarks on the definitions:

a) Subadditivity, superadditivity and additivity for E are
defined w.r.t. each bivariate probability distribution P*Q, so
that a measure of entropy may be subadditive (superadditive,
additive) for some distributions and may not be so for others.
If a measure is subadditive (superadditive, additive) for all
bivariate probability distributions, it is simply said to be
subadditive (superadditive, additive).

Then Shannon's measure of entropy

$S(P) = -\sum_{i=1}^{m} p_i \ln p_i$    (4.11)

is shown to be additive and subadditive for all distributions.

As such we shall say that Shannon's measure of entropy has both
additivity and subadditivity properties.
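
As a small numerical illustration of definitions (4.7) and (4.9) for Shannon's measure, the following sketch computes the entropies of a joint distribution, of its marginals and of the product of the marginals. The 2 x 2 joint distribution used here is only an illustrative choice, and the helper names are hypothetical.

```python
import math


def shannon(dist):
    # Shannon entropy in nats; zero cells contribute nothing.
    return -sum(p * math.log(p) for p in dist if p > 0)


def marginals(joint):
    # Row sums give P, column sums give Q, as in (4.6).
    rows = [sum(row) for row in joint]
    cols = [sum(col) for col in zip(*joint)]
    return rows, cols


joint = [[0.1, 0.5], [0.3, 0.1]]  # illustrative P*Q, not independent
p, q = marginals(joint)

s_joint = shannon([x for row in joint for x in row])      # S(P*Q)
s_sum = shannon(p) + shannon(q)                           # S(P) + S(Q)
s_product = shannon([pi * qj for pi in p for qj in q])    # S(PQ)

print("S(P*Q) =", s_joint)
print("S(P) + S(Q) =", s_sum)
print("S(PQ) =", s_product)
```

Here `s_joint` does not exceed `s_sum` (subadditivity), while `s_product` agrees with `s_sum` up to rounding (additivity).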

Again we shall show in the present chapter that Renyi's
measure of entropy

$R(P) = \frac{1}{1-\alpha} \ln \sum_{i=1}^{m} p_i^{\alpha}, \quad \alpha \neq 1,\ \alpha > 0$    (4.12)

satisfies the subadditivity property for prob. distns. when $\alpha < 1$.
Then every member of Renyi's family of measures of entropy for
which $0 < \alpha < 1$ is subadditive. Similarly it is known that every
member of Renyi's family of measures (whether $0 < \alpha < 1$ or $\alpha > 1$)
is additive.

It is also known that every member of Havrda-Charvat's
family of measures

$H(P) = \frac{1}{1-\alpha} \left[ \sum_{i=1}^{m} p_i^{\alpha} - 1 \right], \quad \alpha > 0,\ \alpha \neq 1$    (4.13)

is subadditive for all $\alpha > 1$, if only the product of two independent
distributions is considered as P*Q. We expect that the measures
belonging to the family H(P) are subadditive for all $\alpha > 1$ for all
distributions, but we are unable to prove it. Neither were we
able to disprove it by a counter-example.

However when $\alpha > 1$, Renyi's measures of entropy are not
subadditive. El-Sayeed [5] gives a numerical example of a prob. distn.
for which the subadditivity condition is violated when $\alpha = 2$. However
his example only shows that Renyi's measure of entropy for $\alpha = 2$
is not subadditive. Renyi's is a family of measures, one measure
corresponding to every value of $\alpha$. Different members of the
family may have different properties. Then Renyi's measure not
being subadditive for $\alpha = 2$ does not mean that it will not be
subadditive for all values of $\alpha$. In fact we know that Renyi's
measure is subadditive for the limiting case, when $\alpha \to 1$. From
continuity considerations we can reasonably expect that Renyi's
measure will also be subadditive for all probability distributions
in the neighbourhood of the parametric value unity. We investigate
whether this is true and discuss similar questions for
superadditivity and additivity.

Similarly Havrda-Charvat's measure represents a
family of entropy measures for different values of $\alpha$,
and some of them may be subadditive for some parametric values
and not for others. We also investigate this problem.

Again for independent variates

$P*Q = PQ$    (4.14)

so that E(P*Q) = E(PQ). The independence inequality [Chapter 3,
(3.1)] requires that for random variates which are not
independent, $E(P*Q) \le E(PQ)$. If the condition is satisfied we
can use [E(PQ) - E(P*Q)] as a measure of the dependence of the
variates.

From (4.12) and (4.13) we get

$R(P) = \frac{1}{1-\alpha} \ln \{ (1-\alpha) H(P) + 1 \}$    (4.15)

$R(P) + R(Q) - R(P*Q) = \frac{1}{1-\alpha} \ln \frac{(1-\alpha)^2 H(P) H(Q) + (1-\alpha)(H(P) + H(Q)) + 1}{(1-\alpha) H(P*Q) + 1} = \frac{1}{1-\alpha} \ln \frac{N}{D}$    (4.16)

where $N = (1-\alpha)^2 H(P) H(Q) + (1-\alpha)(H(P) + H(Q)) + 1$    (4.17)

$D = (1-\alpha) H(P*Q) + 1$    (4.18)

We use (4.16) and (4.17) in the next section to investigate the
relationship between the subadditivity and superadditivity of
H(P) and R(P).
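
Relations (4.15) and (4.16) are easy to confirm numerically. The sketch below assumes the forms (4.12) and (4.13), with the Havrda-Charvat normalization $1/(1-\alpha)$ implied by (4.15); the function names are hypothetical, and the joint distribution is the first Type I distribution listed later in Table 4.3.1.

```python
import math


def renyi(dist, alpha):
    # Renyi entropy (4.12), alpha > 0, alpha != 1.
    return math.log(sum(p ** alpha for p in dist)) / (1.0 - alpha)


def havrda_charvat(dist, alpha):
    # Havrda-Charvat entropy (4.13) with normalization 1/(1 - alpha).
    return (sum(p ** alpha for p in dist) - 1.0) / (1.0 - alpha)


def check_relations(joint, alpha):
    flat = [x for row in joint for x in row]
    p = [sum(row) for row in joint]
    q = [sum(col) for col in zip(*joint)]
    hp, hq, hj = (havrda_charvat(d, alpha) for d in (p, q, flat))
    # N and D as in (4.17) and (4.18).
    n_val = (1 - alpha) ** 2 * hp * hq + (1 - alpha) * (hp + hq) + 1
    d_val = (1 - alpha) * hj + 1
    lhs = renyi(p, alpha) + renyi(q, alpha) - renyi(flat, alpha)
    rhs = math.log(n_val / d_val) / (1 - alpha)          # right side of (4.16)
    r_from_h = math.log((1 - alpha) * hp + 1) / (1 - alpha)  # right side of (4.15)
    return lhs, rhs, renyi(p, alpha), r_from_h


joint = [[0.005, 0.095], [0.285, 0.615]]
for a in (0.5, 1.285, 2.0):
    print(a, check_relations(joint, a))
```

For each value of $\alpha$ the first two returned numbers should coincide (relation (4.16)), as should the last two (relation (4.15)). The check rests on the observation that N is the product $\sum p_i^{\alpha} \sum q_j^{\alpha}$ and D is $\sum p_{ij}^{\alpha}$.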


4.2 On the Relationship between Subadditivity and Superadditivity
of Renyi's and Havrda-Charvat's Measures of Entropy

Theorem 4.2.1: If $0 < \alpha < 1$,

a) The subadditivity of H ==> the subadditivity of R

b) The superadditivity of R ==> the superadditivity of H

Proof: a) Subadditivity of H for P*Q ==> $H(P) + H(Q) \ge H(P*Q)$

==> $(1-\alpha)(H(P) + H(Q)) \ge (1-\alpha) H(P*Q)$, since $0 < \alpha < 1$
==> $1 + (1-\alpha)(H(P) + H(Q)) \ge 1 + (1-\alpha) H(P*Q)$

==> $(1-\alpha)^2 H(P) H(Q) + (1-\alpha)(H(P) + H(Q)) + 1 \ge (1-\alpha) H(P*Q) + 1$

==> $N \ge D$ ==> $\ln \frac{N}{D} \ge 0$ ==> $\frac{1}{1-\alpha} \ln \frac{N}{D} \ge 0$
==> $R(P) + R(Q) - R(P*Q) \ge 0$
==> Subadditivity of R for P*Q.

b) Superadditivity of R for P*Q ==> $R(P) + R(Q) \le R(P*Q)$

==> $\frac{1}{1-\alpha} \ln \frac{N}{D} \le 0$ ==> $\frac{N}{D} \le 1$

==> $(1-\alpha)(H(P) + H(Q) - H(P*Q)) \le -(1-\alpha)^2 H(P) H(Q) \le 0$
==> $H(P) + H(Q) - H(P*Q) \le 0$
==> superadditivity of H.

Theorem 4.2.2: If $\alpha > 1$, then

a) Subadditivity of R ==> subadditivity of H

b) Superadditivity of H ==> superadditivity of R.

Proof: a) Subadditivity of R for P*Q ==> $R(P) + R(Q) \ge R(P*Q)$

==> $\frac{1}{1-\alpha} \ln \frac{N}{D} \ge 0$ ==> $N \le D$, since $\alpha > 1$

==> $(1-\alpha)^2 H(P) H(Q) + (1-\alpha)(H(P) + H(Q)) \le (1-\alpha) H(P*Q)$

==> $(1-\alpha)\{H(P) + H(Q) - H(P*Q)\} \le -(1-\alpha)^2 H(P) H(Q) \le 0$

==> $H(P) + H(Q) - H(P*Q) \ge 0$, since $1-\alpha < 0$
==> subadditivity of H.

b) Superadditivity of H for P*Q ==> $H(P) + H(Q) \le H(P*Q)$

==> $(1-\alpha)(H(P) + H(Q)) \ge (1-\alpha) H(P*Q)$

==> $(1-\alpha)^2 H(P) H(Q) + (1-\alpha)\{H(P) + H(Q)\} + 1 \ge (1-\alpha) H(P*Q) + 1$
==> $N \ge D$

==> $\frac{1}{1-\alpha} \ln \frac{N}{D} \le 0$ ==> $R(P) + R(Q) \le R(P*Q)$

==> Superadditivity of R.

Theorem 4.2.3:

a) Subadditivity of H for P*Q ==> subadditivity of R for
P*Q if $0 < \alpha < 1$, and superadditivity of H for P*Q ==>
superadditivity of R for P*Q if $\alpha > 1$.

b) Subadditivity of R for P*Q ==> subadditivity of H for
P*Q if $\alpha > 1$, and superadditivity of R for P*Q ==>
superadditivity of H for P*Q if $0 < \alpha < 1$.

Theorem 4.2.3 is a restatement of Theorems 4.2.1 and 4.2.2 and
hence requires no further proof. The converses of these results
are not true. For, consider the following example.

E xample 4*2*1 : 

a) Let P*Q = 

Then for a = 0*1, R„(P*Q) - R«(P) - R„(Q) = -0*17759 io"’^<C 

VA« (jU CXr 

and a = 0*1, H^(P*Q) -H^(p)-H^(Q) = 0-75835 > 0. 

R,. - is subadditive and . is superadditive* This is 
U ffX u ♦x 

a covinter-example for the converse of Thm* 4*2*3*a), 
first statement* 

b) Again same distribution as in a) is considered* Then 
for a = 1*285 

R|^(P'5frQ)-RQj^(P)-Rjj,(Q) = 0*1735 lo“'^ > 0,R^ ^235 superadditiv 

Hq^(p«-Q) -Hq^(p)-Hj^(Q) =0*039911 < 0*H^^235 subadditive- 

-% This constitutes a coxanter-exaraple for the converse of 
Thm* 4*2*3*a) second statement* 
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c) Case b) above works as a counter-example for the 
converse of Thm. 4.2.3 b), the first statement, because
there, for $\alpha = 1.285$, $H_{\alpha}$ is subadditive whereas $R_{\alpha}$ is
superadditive.

d) Case a) works as a counter-example for the converse of
Thm. 4.2.3 b), the second statement, because there, for
$\alpha = 0.1$, $H_{\alpha}$ is superadditive and $R_{\alpha}$ is subadditive.

4.3 Subadditivity and Superadditivity of Renyi's Measure of
Entropy

Theorem 4.3.1: Renyi's measure of entropy satisfies the
inequalities

$R(P*Q) \ge R(P)$ and $R(P*Q) \ge R(Q)$ for all P*Q    (4.19)

Proof: Since $p_{ij} = p_i\, p_{j|i}$, where $p_{j|i}$ is the conditional
probability that the second experiment results in the jth outcome
when the outcome of the first experiment is known to have been $x_i$, we get

$\sum_{j=1}^{n} p_{ij}^{\alpha} = p_i^{\alpha} \sum_{j=1}^{n} p_{j|i}^{\alpha}, \quad i = 1, 2, \ldots, m$    (4.20)

$0 < \alpha < 1$ ==> $p_{j|i}^{\alpha} \ge p_{j|i}$, i = 1,2,...,m; j = 1,2,...,n

==> $\sum_{j=1}^{n} p_{j|i}^{\alpha} \ge \sum_{j=1}^{n} p_{j|i} = 1$ (because $\sum_{j=1}^{n} p_{j|i} = 1$)

==> $\sum_{j=1}^{n} p_{ij}^{\alpha} \ge p_i^{\alpha}$, i = 1,2,...,m, by (4.20)

==> $\ln \left( \sum_{i=1}^{m} \sum_{j=1}^{n} p_{ij}^{\alpha} \right) \ge \ln \sum_{i=1}^{m} p_i^{\alpha}$

==> $\frac{1}{1-\alpha} \ln \sum_{i=1}^{m} \sum_{j=1}^{n} p_{ij}^{\alpha} \ge \frac{1}{1-\alpha} \ln \sum_{i=1}^{m} p_i^{\alpha}$

==> $R(P*Q) \ge R(P)$.

If $\alpha > 1$, $p_{j|i}^{\alpha} \le p_{j|i}$ ==> $\ln \sum_{i=1}^{m} \sum_{j=1}^{n} p_{ij}^{\alpha} \le \ln \sum_{i=1}^{m} p_i^{\alpha}$

==> $\frac{1}{1-\alpha} \ln \sum_{i=1}^{m} \sum_{j=1}^{n} p_{ij}^{\alpha} \ge \frac{1}{1-\alpha} \ln \sum_{i=1}^{m} p_i^{\alpha}$

==> $R(P*Q) \ge R(P)$

so that whether $\alpha > 1$ or $0 < \alpha < 1$, $R(P*Q) \ge R(P)$. Similarly,
whether $\alpha > 1$ or $0 < \alpha < 1$, $R(P*Q) \ge R(Q)$, and

$2 R(P*Q) \ge R(P) + R(Q)$    (4.21)

Generalizing for k distributions, we obtain

$k\, R(P_1 * P_2 * \cdots * P_k) \ge R(P_1) + R(P_2) + \cdots + R(P_k)$

so that

$\frac{R(P_1) + R(P_2) + \cdots + R(P_k)}{k} \le R(P_1 * P_2 * \cdots * P_k)$    (4.22)
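
Inequality (4.19) can be spot-checked on randomly generated joint distributions. The following is a minimal sketch, assuming definition (4.12); the function names and the particular grid of $\alpha$ values are illustrative choices only.

```python
import math
import random


def renyi(dist, alpha):
    # Renyi entropy (4.12), alpha > 0, alpha != 1.
    return math.log(sum(p ** alpha for p in dist)) / (1.0 - alpha)


def joint_dominates_marginals(trials=500, seed=1):
    """Check R(P*Q) >= R(P) and R(P*Q) >= R(Q) on random joint distributions."""
    rng = random.Random(seed)
    for _ in range(trials):
        m, n = rng.randint(2, 4), rng.randint(2, 4)
        cells = [[rng.random() + 1e-6 for _ in range(n)] for _ in range(m)]
        total = sum(sum(row) for row in cells)
        joint = [[c / total for c in row] for row in cells]
        p = [sum(row) for row in joint]
        q = [sum(col) for col in zip(*joint)]
        flat = [x for row in joint for x in row]
        alpha = rng.choice([0.2, 0.5, 0.9, 1.5, 2.0, 5.0])
        r_joint = renyi(flat, alpha)
        # Small tolerance guards against rounding in the logarithms.
        if r_joint < max(renyi(p, alpha), renyi(q, alpha)) - 1e-12:
            return False
    return True


print(joint_dominates_marginals())
```

A `True` result is consistent with Theorem 4.3.1 for both ranges $0 < \alpha < 1$ and $\alpha > 1$; it is of course no substitute for the proof.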


Theorem 4.3.2: Renyi's measure of entropy is a monotonic
decreasing function of $\alpha$. This result has been proved by Kapur [34].

Now if we denote by $R_{\alpha}(P)$ the Renyi measure of entropy of order $\alpha$,
then

$R_0(P) = \left[ \frac{1}{1-\alpha} \ln \sum_{i=1}^{m} p_i^{\alpha} \right]_{\alpha = 0} = \ln m$    (4.23)

$R_1(P) = \lim_{\alpha \to 1} \frac{1}{1-\alpha} \ln \sum_{i=1}^{m} p_i^{\alpha} = -\sum_{i=1}^{m} p_i \ln p_i$    (4.24)

$R_{\infty}(P) = \lim_{\alpha \to \infty} \frac{1}{1-\alpha} \ln \sum_{i=1}^{m} p_i^{\alpha} = -\ln (\max_i p_i)$    (4.25)
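
The three limiting values (4.23)-(4.25) and the monotonic decrease in $\alpha$ can be observed numerically. The sketch below approximates the limits by evaluating (4.12) at values of $\alpha$ close to 0 and 1 and at a large value; the distribution, the evaluation points and the helper names are illustrative assumptions.

```python
import math


def renyi(dist, alpha):
    # Renyi entropy (4.12), alpha > 0, alpha != 1.
    return math.log(sum(p ** alpha for p in dist)) / (1.0 - alpha)


def shannon(dist):
    return -sum(p * math.log(p) for p in dist if p > 0)


def is_non_increasing(values, tol=1e-12):
    return all(b <= a + tol for a, b in zip(values, values[1:]))


dist = [0.1, 0.2, 0.3, 0.4]  # illustrative distribution with m = 4

near_zero = renyi(dist, 1e-6)       # should be close to ln m, cf. (4.23)
near_one = renyi(dist, 1 + 1e-6)    # should be close to Shannon's entropy, cf. (4.24)
large = renyi(dist, 200.0)          # approaches -ln(max p_i) slowly, cf. (4.25)

grid = [0.1, 0.5, 0.9, 1.1, 2.0, 5.0, 20.0]
values = [renyi(dist, a) for a in grid]

print(near_zero, math.log(len(dist)))
print(near_one, shannon(dist))
print(large, -math.log(max(dist)))
print(is_non_increasing(values))
```

The convergence to $-\ln(\max_i p_i)$ is slow (the error behaves like $1/\alpha$), which is why the third comparison is only approximate at $\alpha = 200$.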

Theorem 4.3.3: Renyi's entropy of order zero is both
subadditive and superadditive for all probability distributions.

Proof: $R_0(P) = \ln m$, $R_0(Q) = \ln n$ and $R_0(P*Q) = \ln (mn)$, so that
$R_0(P*Q) = \ln(mn) = \ln(m) + \ln(n) = R_0(P) + R_0(Q)$. Therefore the
inequalities

$R_0(P*Q) \le R_0(P) + R_0(Q)$,

$R_0(P*Q) \ge R_0(P) + R_0(Q)$

are both satisfied for all prob. distns.

Theorem 4.3.4: Renyi's entropy of order unity is subadditive.

Proof: By definition, Renyi's entropy of order unity is the same
as Shannon's measure of entropy, which is known to be both
subadditive and additive.

Theorem 4.3.5: Renyi's entropy of order $\infty$ is superadditive or
subadditive according as

$\max_{i,j} p_{ij} \le$ or $\ge \max_i (p_i) \max_j (q_j)$.

Proof: $R_{\infty}(P) = -\ln (\max_i p_i)$, $R_{\infty}(Q) = -\ln (\max_j q_j)$ and
$R_{\infty}(P*Q) = -\ln (\max_{i,j} p_{ij})$, so that $R_{\infty}$ is superadditive or
subadditive according as

$R_{\infty}(P*Q) \ge$ or $\le R_{\infty}(P) + R_{\infty}(Q)$,

i.e., according as

$\max_{i,j} p_{ij} \le$ or $\ge \max_i (p_i) \max_j (q_j)$.

We shall illustrate Theorem 4.3.5 by the following example.

Example 4.3.1: Consider the following probability distributions.

I. $p_{11} = 0.1$, $p_{12} = 0.5$, $p_{21} = 0.3$, $p_{22} = 0.1$;
$p_1 = 0.6$, $p_2 = 0.4$, $q_1 = 0.4$, $q_2 = 0.6$. Then we have
$\max_{i,j}(p_{ij}) = 0.5$, $\max_i(p_i) = 0.6$, $\max_j(q_j) = 0.6$. In this case
Renyi's measure of entropy of order $\infty$ is subadditive.

II. $p_{11} = 0.1$, $p_{12} = 0.3$, $p_{21} = 0.3$, $p_{22} = 0.3$;
$p_1 = 0.4$, $p_2 = 0.6$, $q_1 = 0.4$, $q_2 = 0.6$. Then we have
$\max_{i,j}(p_{ij}) = 0.3$, $\max_i(p_i) = 0.6$, $\max_j(q_j) = 0.6$. In this case
Renyi's entropy of order $\infty$ is superadditive.

III. $p_{ij} = 0.25$, i = 1,2, j = 1,2; $p_i = 0.5$, $q_j = 0.5$, i = 1,2, j = 1,2.
Then we have
$\max_{i,j}(p_{ij}) = 0.25$, $\max_i p_i = \max_j q_j = 0.5$. In this case
Renyi's entropy of order $\infty$ is both subadditive and
superadditive.
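
The criterion of Theorem 4.3.5 reduces to comparing two numbers, so the three cases of Example 4.3.1 can be reproduced mechanically. The sketch below is an illustration only; the function name and the returned labels are hypothetical, and a relative tolerance is used to detect the boundary case of equality.

```python
import math


def classify_order_infinity(joint):
    """Classify R_infinity for a joint distribution given as a list of rows."""
    p = [sum(row) for row in joint]           # marginal P (row sums)
    q = [sum(col) for col in zip(*joint)]     # marginal Q (column sums)
    max_joint = max(max(row) for row in joint)
    bound = max(p) * max(q)
    if math.isclose(max_joint, bound, rel_tol=1e-9):
        return "both"
    # max p_ij > max p_i * max q_j  <=>  R_inf(P*Q) < R_inf(P) + R_inf(Q)
    return "subadditive" if max_joint > bound else "superadditive"


case_i = [[0.1, 0.5], [0.3, 0.1]]
case_ii = [[0.1, 0.3], [0.3, 0.3]]
case_iii = [[0.25, 0.25], [0.25, 0.25]]

for name, joint in (("I", case_i), ("II", case_ii), ("III", case_iii)):
    print(name, classify_order_infinity(joint))
```

The three calls reproduce the conclusions stated for cases I, II and III of the example.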

We shall categorize the probability distributions P*Q
into two types:

Type I: for which $\max_{i,j} p_{ij} < \max_i (p_i) \max_j (q_j)$ and

Type II: for which $\max_{i,j} p_{ij} \ge \max_i (p_i) \max_j (q_j)$.

The range of values of $\alpha$ for which Renyi's measure of
entropy of order $\alpha$ is subadditive or superadditive depends on
whether P*Q belongs to the Type I or the Type II class of distributions.

For independent distributions, $p_{ij} = p_i q_j$ and as such

$\max_{i,j} (p_{ij}) = \max_i (p_i) \max_j (q_j)$;

therefore all the product distributions are classed in Type II.

Theorem 4.3.6: Renyi's entropy of order $\alpha$ is a concave function
of P when $0 < \alpha < 1$ and is a pseudo-concave function of P when
$\alpha > 1$.

Proof: a) $0 < \alpha < 1$. We have

$\frac{d^2}{dp_i^2} (p_i^{\alpha}) = \alpha(\alpha-1)\, p_i^{\alpha-2} < 0$ ==> $p_i^{\alpha}$ is a concave function of $p_i$

==> $\sum_{i=1}^{m} p_i^{\alpha}$ is a concave function of P

==> $\ln \sum_{i=1}^{m} p_i^{\alpha}$ is a concave function of P

==> $\frac{1}{1-\alpha} \ln \left( \sum_{i=1}^{m} p_i^{\alpha} \right)$ is a concave function of P,

i.e., Renyi's entropy is a concave function of P.

When $\alpha > 1$,

$\frac{d^2}{dp_i^2} (p_i^{\alpha}) = \alpha(\alpha-1)\, p_i^{\alpha-2} > 0$ ==> $p_i^{\alpha}$ is a convex function of $p_i$

==> $\sum_{i=1}^{m} p_i^{\alpha}$ is a convex function of P

==> $\ln \sum_{i=1}^{m} p_i^{\alpha}$ is a pseudo-convex function of P

==> $\frac{1}{1-\alpha} \ln \sum_{i=1}^{m} p_i^{\alpha}$ is a pseudo-concave function of P,

i.e., Renyi's measure of entropy of order $\alpha$ is a pseudo-concave
function of P when $\alpha > 1$.

From the above discussion it appears that

(i) $F(\alpha) = R_{\alpha}(P) + R_{\alpha}(Q)$ and $G(\alpha) = R_{\alpha}(P*Q)$ are both monotonic
decreasing functions of $\alpha$ as $\alpha$ varies from 0 to $\infty$.

(ii) Both functions start at the common value ln(mn) at $\alpha = 0$.

(iii) At $\alpha = 1$, $F(1) = \lim_{\alpha \to 1} F(\alpha) \ge G(1) = \lim_{\alpha \to 1} G(\alpha)$, since
$R_1(P*Q) \le R_1(P) + R_1(Q)$, as Shannon's measure of entropy
is subadditive. Also F(1) = G(1) iff P and Q are
independent, otherwise G(1) < F(1).

(iv) As $\alpha \to \infty$, $F(\infty) \ge$ or $\le G(\infty)$ according as $R_{\infty}(P) + R_{\infty}(Q) \ge$ or $\le R_{\infty}(P*Q)$.

(v) The graphs of $F(\alpha)$ and $G(\alpha)$ intersect at $\alpha = 1$ if P and Q
are independent, otherwise they intersect at $\alpha = \alpha^*$,
where $\alpha^* > 1$.

We have drawn the following graphs of $G(\alpha) - F(\alpha)$ vs $\alpha$
for some experimental distributions from both the types.



[Fig. 4.3.1, Fig. 4.3.2 and Fig. 4.3.3: graphs of $G(\alpha) - F(\alpha)$ against $\alpha$]


Here in Fig. 4.3.1, $G(\alpha) - F(\alpha)$ intersects the $\alpha$-axis
at no point between 0 and 1, and intersects it at $\alpha^*$ between 1
and $\infty$ for Type I distributions, so that for all values of $\alpha$
lying between 0 and $\alpha^*$ it is subadditive and for $\alpha > \alpha^*$ it
is superadditive. In Fig. 4.3.1, we have considered the
following three Type I distributions:

(1) : P*Q = (q^ 285 0*615^' ^ ~ (0*l/0*9) and Q 

(2) = P«Q = P = (0.1,0.9) and Q 

(3) = P*Q = (°;°“ ; °;°”), P = (0,1,0,9) and Q 

In Fig* 4*3*2 we considered a type II distribution* 

P*Q = ' 2*2^)/ P = (0*1, 0*9) and Q = (0*15,0.35)* Here we 

UoX ^ U#o 

get an $\alpha^*$ lying between 0 and 1 so that for $\alpha < \alpha^*$, $R_{\alpha}(P*Q)$ is
superadditive and for all $\alpha > \alpha^*$, $R_{\alpha}(P*Q)$ is subadditive.

Again we considered the following Type II distributions:


= (0.29,0*71) 
= (0.29,0,71) 
= (0.29,0.71) 


(1) : P*Q = (2*0 P = (0*.6,0*4) and Q = (0.4, 0.6) 

U # O U # JL 
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(2) : PJ<-Q - (q*q 2 olsa^ ' ^ “ {0*i,0»9) and Q = (o*i#0#9) 

then we observe that no (X„ exists in these cases and R (P) is 
always subadditive* 

However our above discussion does not rule out the
possibility of an even number of points of intersection
between $G(\alpha) - F(\alpha)$ and the $\alpha$-axis between 0 and 1 and an
odd number of points of intersection between 1 and $\infty$ for
other type I distributions. However in view of the monotonic
and concavity characters of $F(\alpha)$ and $G(\alpha)$ this possibility is
unlikely. The large number of calculations we have
carried out also suggests that for type I distributions, there
is only one point of intersection $\alpha_*$, where $\alpha_* > 1$.

So it has been established that for every distribution
P*Q there is a certain interval including $\alpha = 1$ in which
$R_\alpha(P)$ is subadditive. We now tabulate some of our numerical
computations.

We observe that for the first two distributions in
Table 4.3.2, $R_\alpha(P)$ is subadditive for all $\alpha$ whereas in the case
of the third distribution, it is subadditive only for $\alpha > \alpha_*$ where
$0.1 < \alpha_* < 0.2$ and superadditive for all $\alpha < \alpha_*$. In the case
of the fourth distribution, $\alpha_*$ lies in the interval (0.52, 0.55).

The following table comprises the critical values of $\alpha$
for type I distributions :
Sl.

No.

Joint 

distribution 

■Marginal 
d i s tribu ti ons 

Interval 
‘ containing 
. a 

GCa) - P(a) 

1 » 

0*005,0*095 

(0*1, 0.9) 

(1.280,1*285) 

-0.2934 X10“'^ 


0*285,0.615 

(0.29,0.71) 


0.1735 X10“^ 

2 . 

0.010,0.090 

0.2 30,0.620 

1 

o 

J 

(1 .225,1 .230) 

-0.2215 X10~^ 
0*1083x 10”^ 

3. 

0.015,0.085 : 
0.275,0.625 

-do- 

(1.170,1.175) 

-0*1223X10"^ 

0*4011x10”^ 

4 • 

0.020,0.08 

0.270,0.63 

-do- 

^ (1.115,1.120) 

-0/.4335 XIO”^ 
0*9384 xlO“^ 

5* 

0*025.0.075 

0.265*0*635 ^ 

-do- 

(1.050,1.055) 

-0.2578X10“'^ 
0,1597 xio"'^ 

6 * 

0.120,0.230 

0.230*0*420 

(0,35,0*65) 

(0*35,0.65) 

(1*060,1.065) 

-0.2190x10’"^ 4 

0.2 563 x10"^ 

7. 

0*080,0*220 

0*220,0.480 

(0*3#0-.7) 

(0.3, 0.7) 

(1.150,1*155) 

-0.1490 xio"^ 
0.2644 x10*"^ 


Table 4*3»1 


In Table 4.3.1, the value $\alpha_*$ lies between the two values
in the 4th column. When P and Q are independent, $\alpha_* = 1$; when
P and Q are nearly independent $\alpha_*$ is slightly greater than unity
and when P and Q are far from independence $\alpha_*$ differs from unity
significantly. In fact $(\alpha_* - 1)$ can be used as a measure of
dependence between P and Q via P*Q.

In fact when $\alpha = 1$, $R_1(P) + R_1(Q) - R_1(P*Q)$ is itself
taken as a measure of dependence. However when $\alpha > 1$,
$R_\alpha(P) + R_\alpha(Q) - R_\alpha(P*Q)$ can be negative and as such the value of
$\alpha$ for which this vanishes for any given distribution P*Q is
taken as a measure of dependence.

The following table comprises the critical values of $\alpha$ for
type II distributions :


; SI# 
No*; 

Joint 

Distribution 

Marginal 
. D i s hr ibution s ‘ 

; Critical 
' Interval 

G(a) - P(a) 

1. ■ 

0.1 , 0#5 

0#3 , 0#1 

.(0*6 , 0.4) 

(0*4 , 0.6) 

Doesn't 
r exist 

a = h ,-0*957 X iO"^ 

a = 2 ,-0.2862017 





'a = 0.8,-0.1470573 





a = 0*1,-0.17759 x10“”^ : 

2 # 

0*08^0*02 

i 0*02,0.83 

: (0*1, 0*9) 

; 

(0*1, 0*9) 

Doesn't 

exist 

a = I, -0*12406 
a = 0*1,-0*231189 xio"*^: 




' 

■ 

CL = 0 # 8 ^ -HD ♦X 6 S 58 




. 

a = 2*0,-0.1504919 

3 # 

0 #1 , 0»1 

0#1 , 0*7 

(0.2 , 0*8) 

(0.2 , 0.8) 

[ {0#i/0#2) 

p 

L 

a = 2, -0.1174 
a = |, -0.163518X10“^ 
a = 0*8,-0.419381 x10“^; 
a = 0*1, +0*56605 X 10 ^ 
a = 0*2, 0.96762 x 10“^ 

■ 4* 

; 0*05,0#05 i 

0*1 ,0*8 

(0*1 ,0*9) ^ 

(0#15,0*S5) 

to#52,0*55) 

a = 0.1 ,0.1081737 
a = 0*5 ,0.21524 X io“^ . 
'a = 0.52,0.0007 





O 

a = 0*55,-0*00157 

a = 0.6,-0*55936 x10“^ 

a = 0*9,-0*31620x10“^ 

-1 ? 

a s 2*0,-0*6970 xlO * 
4.4 Subadditivity and Superadditivity of Havrda-Charvat's
Measure of Entropy

Theorem 4.4.1 : Havrda-Charvat's measure of entropy satisfies
the inequality

    $H_\alpha(P*Q) \ge H_\alpha(P)$ and $H_\alpha(P*Q) \ge H_\alpha(Q)$.    (4.27)

Proof : It is similar to that of Theorem 4.3.1.

Theorem 4.4.2 : Havrda-Charvat's measure of entropy is a
monotonic decreasing function of $\alpha$.

For proof, refer to Kapur [34].

Theorem 4.4.3 : Havrda-Charvat's measure of entropy of order
zero is superadditive.

Proof : $H_0(P*Q) - H_0(P) - H_0(Q) = (mn-1) - (m-1) - (n-1) = (m-1)(n-1) > 0$
because $m \ge 2$, $n \ge 2$. Hence

    $H_0(P*Q) > H_0(P) + H_0(Q)$.

Theorem 4.4.4 : $H_1(P)$ is subadditive.

Proof follows immediately because $H_1(P)$ is Shannon's measure of
entropy which is subadditive.

Theorem 4.4.5 : $H_\infty(P)$ is both superadditive and subadditive.

Proof : We can easily see that

    $H_\infty(P*Q) - H_\infty(P) - H_\infty(Q) = 0$.

Theorem 4.4.6 : Havrda-Charvat's measure of entropy of order $\alpha$
is a concave function of P for all values of $\alpha$.

Refer to Kapur [34] for proof.

Because Havrda-Charvat's measure is superadditive at $\alpha = 0$
and is subadditive when $\alpha \ge 1$, it must change from being
superadditive to being subadditive at some $\alpha^*$ between 0 and 1.
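
The argument above (superadditive at order 0, subadditive at order 1, hence a crossing in between) can be illustrated numerically. This is a sketch under stated assumptions: the normalisation $H_\alpha(P) = (\sum p_i^\alpha - 1)/(1-\alpha)$ is inferred from the order-zero value $n - 1$ used in Theorem 4.4.3, the function names are hypothetical, and the joint distribution is the first one listed in the tables.

```python
import math

def havrda_charvat(p, alpha):
    """Havrda-Charvat entropy, assumed form (sum p^alpha - 1)/(1 - alpha); alpha = 1 is Shannon."""
    p = [x for x in p if x > 0]
    if abs(alpha - 1.0) < 1e-12:
        return -sum(x * math.log(x) for x in p)
    return (sum(x ** alpha for x in p) - 1.0) / (1.0 - alpha)

def hc_gap(joint, alpha):
    """H(P*Q) - H(P) - H(Q): positive means superadditive, negative means subadditive."""
    rows = [sum(r) for r in joint]
    cols = [sum(c) for c in zip(*joint)]
    cells = [x for r in joint for x in r]
    return (havrda_charvat(cells, alpha)
            - havrda_charvat(rows, alpha)
            - havrda_charvat(cols, alpha))

def bisect_root(f, lo, hi, tol=1e-10):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

joint = [[0.005, 0.095], [0.285, 0.615]]
m, n = len(joint), len(joint[0])

gap_at_zero = hc_gap(joint, 0.0)   # should equal (m - 1)(n - 1)
gap_at_one = hc_gap(joint, 1.0)    # Shannon case, should be negative
alpha_crossing = bisect_root(lambda a: hc_gap(joint, a), 0.0, 1.0)
print(gap_at_zero, gap_at_one, alpha_crossing)
```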


The following table contains our numerical work with
$H_\alpha(P*Q) - H_\alpha(P) - H_\alpha(Q)$ :


SI. 

No. 

Joint 

Distribution 

Marginal 

Di stributions 

' Critical 
interval 
of a 


1 . 

0.005,0.095 

(0*1 ,0.9) 

0.335,0.840 

0.1711x10”^ 


0.235,0.615 



-0.619 Xl0“^ 

2 # 

0.010,0.090 

;(0.i ,0.9) 

0.920,0.925 

0.641 X lo""^ 


0*280,0.620 

: (0.29,0.71) 


-0.749 X 10 

3. 

0.015,0.085 

(0.1 ,0.9) 

0*960,0.965 

0.605 xlo"^ 


0.275,0*625 

(0.29,0.71) 


-0.2 51xl0”^ 

4. 

0.020,0.080 

(0,1 ,0.90) 

0.985,0.990 

0.313 xio”^ 


0-270,0.630 

(0.29,0.71) 


-0.531 X 10 

5 . 

[ 0.025,0.075 

(0.1 ,0.9) 

i 

0.9975,0.993C 

^0.298 x10""^ 


0.265,0.635 

! (0.29,0.71) 


-0*745X10'"^ 


■ 0.030,0.070 : 

> (0.1 ,0.9) 

0.99960 

0.745 xlO*"^ 

o ^ 

0.260,0.640 

i (0.29,0.71) 

0.99965 

-0.119x10“^ 


0.150,0*2 50 : 

; (0.4 ,0.6) 

0.9980 

0.373 xio”"^ 

/ # 

0.2 50,0.350 ; 

! (0-4 ,0.6) 

0.9935 

-0.198 Xlo”^ 

o 

0.200,0.250 ; 

: (0.45,0.55) 

0.99975 

0.596 x10“"^ 

o ♦ 

i 

0*250,0.300 

(0.45,0.55) 

0.99980 

-0.745 XI 0“^ 

\ 


Table 4.4.1

From Tables 4.3.1 and 4.3.2 we deduce the following
table :

Sl. No.          1       2       3       4       5       6       7
$\alpha^*$       0.838   0.920   0.963   0.987   0.997   0.9976  1.0000
$\alpha_*$       1.285   1.230   1.175   1.125   1.065   1.053   1.0000
$1-\alpha^*$     0.162   0.080   0.037   0.0130  0.0030  0.0024  0
$\alpha_*-1$     0.285   0.230   0.175   0.125   0.065   0.053   0

Table 4.4.2


We find that both $1-\alpha^*$ and $\alpha_*-1$ decrease or increase
together and either can be used as a measure of dependence
between P and Q, viz. P*Q. We describe the situation in Fig. 4.4.1.

[Fig. 4.4.1: along the $\alpha$-axis, Havrda-Charvat's measure is superadditive to the left of $\alpha^*$ and subadditive to the right of it; Renyi's measure is subadditive to the left of $\alpha_*$ and superadditive to the right of it; the point 1 lies between $\alpha^*$ and $\alpha_*$.]

In general the portions of subadditivity and superadditivity
are given in the above figure. For type I distributions
$\alpha = 1$ is always included in the subadditivity range and $1-\alpha^*$
and $\alpha_*-1$ can always be used as measures of dependence, $\alpha^*$ and
$\alpha_*$ of course depending on P*Q.

For $H_\alpha(P)$ our classification of the bivariate probability
distributions into type I and type II has no relevance. That is
so because irrespective of the type of the distribution $H_\alpha(P)$
is subadditive for $\alpha = 1$ and superadditive for $\alpha = 0$, thereby
assuring us of at least one $\alpha^*$, $0 < \alpha^* < 1$. Though we have been
unable to rule out the possibility of more than one $\alpha^*$ and $\alpha_*$,
we can reasonably consider that this possibility is highly
unlikely. Even if there exist more than one $\alpha^*$ or $\alpha_*$ we may
consider the $\alpha^*$ and $\alpha_*$ nearest to unity for finding the degree of
dependence between P and Q in P*Q.

A consequence of Theorem 4.2.1(a) is that since $H_\alpha$ is
subadditive in $(\alpha^*, 1)$, $R_\alpha$ should also be subadditive in $(\alpha^*, 1)$.
Similarly we deduce from Theorem 4.1.2(a) that subadditivity
of $R_\alpha$ in $(1, \alpha_*)$ means subadditivity of $H_\alpha$ there. Then in
$(\alpha^*, \alpha_*)$ both $H_\alpha$ and $R_\alpha$ are subadditive. Note that this holds
good only in the case of type I distributions. In the case of
type II distributions, if $\alpha_*$ exists, both $\alpha^*$ and $\alpha_*$ are less
than 1. There is no way of determining which of them is larger.
Thus for type I distributions, we have established a common
interval for $\alpha$, containing 1, where $H_\alpha$ and $R_\alpha$ are both subadditive.


4.5 Subadditivity and Superadditivity of Kapur-Aczel-Daroczy
Measure of Entropy


n 






n 

S P' 
i=l 


} a 1, 




(4.28) 


If $\beta = 1$, (4.28) reduces to $R_\alpha(P)$. So all the results obtained
in sections 4.2 and 4.3 follow as a special case for $\beta = 1$.
We have performed some calculations for this measure using
type I distributions. Our results are tabulated below. For a
meaningful definition of this measure we should have either
$\alpha > 1$, $\beta < 1$ or $\alpha < 1$, $\beta > 1$. Confining ourselves to this
restriction we have fixed one parameter ($\alpha$ or $\beta$) and found the
value of the other parameter up to which $H_{\alpha,\beta}$ is subadditive,
for some of the type I distributions we considered for
$R_\alpha(P)$ and $H_\alpha(P)$.

Distribution : P»Q = (oljsl;?!?!!’ ' 

F ixed 3 t 



a ' 


Inferences 


r ' 



1*3 

0.65 ^ 

-0.11691X10"^ 

For ^ = 1.3, p is subadditive 

n 

0-70 i 

0.40789 xio""^ 

. -3 

for all a < 0.6 5, superadditive 

for 0.70 < ct < 1* 

; 1 «4 

1 0-50 

-0,16493 X10 

For ^ = 1.4.H p. is subadditive 

Gu^ p 


0.55 ^ 

0-14006 xiO 

-3 • 

for a < 0.5 and superadditive 

for 0*55 < a < 1* 

1 >5 

' 0.30 : 

■ -0.95342 XlO ; 

' -3 ' 

For 3 =! 1*5,H„ o is subadditive 
a,p 


0*35 

: 0.63733 x10 

for a < 0#3 and superadditive 

: ^ 

■ 



for 0*35 < a < 1. ; 

For all 3 > 1-5, H « is always / 

CXjf p 

superadditive for 0 < a < 1 » : 

- , r--- ■ 


Table 4*5*1 
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Fixed a 


a 

T--- 


Inferences 

0 mX 

1*65 

-0*18627 X io“^ 

For a = 0*1, H „ is subadditive 

Ct/P 


1 .70 

0.744 55X io”^ 

for 1 < ^ £ 1*65 and superadditive 

for jS > 1*70, 

0,5 

! 1*45 

: -0.164 86 x10"^ 

For a = 0*5, H o is subadditive 


1*50 

0*2 5758X10“^ 

for 1 £ 0 < 1,45 and superaddiuive 

for ^ > 1*50 

' 0*9 

1 *20 

-0,22330 X10“^ 

-3 

For a = 0*9, „ Is subadditive ‘ 


1 *2 5 

0.70753x10 ^ 

for 1 £ 3 < 1,2 and superadditive ^ 

for 3 > 1*2 M 


Table 4*5*2 


Distribution : P*Q = 

0*2 0#62 

F ixed j3 : 


. ^ ^ 

a 



Inferences 






'l .3 

0*60 ’ 
0-65 ^ 

1^' ' : 

-0.33604 xio 

0.2 3051x 10*"^ 

; For 3 = 

' 

. 0 < 0^ < 

1,3,H„ a is subadditive for 

P 

0*6 and superadditive for 




0,65 < a £ 1*0. 

1 ,4 ' 

0.45 ' 

0.50 ’ 

-0*63866 x10“'^ 

■ 0,10350 Xio"^ 

^ For 3 = 
for 0 £ 

i,4,H a is subadditive 

a,p 

a £ 0,45 and superadditive ' 




for 0*5 

< a < 1 ' 

1*5 j 

. 1 

D *2 5 ^ 

0*30 i 

.-0,49169 X lo“^ 1 

0.58626 Xio“^ ^ 

For 3 = 
i for 0 £ 

i,5,H„ a is subadditive 

' a,3 . 

a < 0*25 and superadditive 

j 

j 


j 

r for 0*3 

£ a £ 1*0 




' For 3 > 

1*5,H_ Q is sxabadditive • ; 

I ^ 


Table 4*5*3* 
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Fixod a : 


a 



Inferences 

0^5 

X.40 : 

-0 <.0994 83 X lo“^ 

For a = 0.5/H„ a is subadditive 

1 

1 *45 

0.10350 X10“'" 

for 1 <. 3 < 1.40 and is super- 

0 »9 

i #20 

-0.54 780 Xl0“^ 

additive for 1.45 < 

For a = 0.9, Q is subadditivc : 

a,^ t 


1.25 

0*15460 X10“^ 

for 1 »0 < JS < 1*20 and is super- | 




1 additive for 1.25 ^ 3. | 

For a < 0.5 also « is super- I 

a,^ - j 

additive for ^ > 1. | 


Table 4 •5#4 • 


With that we conclude this chapter.


Chapter 5

Two Optimization Problems in Information Theory

Introduction :

A probability space is one in which a point P has n
coordinates $p_1, p_2, \ldots, p_n$ where

    $p_1 \ge 0,\ p_2 \ge 0,\ \ldots,\ p_n \ge 0$ and $\sum_{i=1}^{n} p_i = 1$    (5.1)

so that every point represents a probability distribution. The
distance of a point P from a point $Q = (q_1, q_2, \ldots, q_n)$ is defined
by Kullback-Leibler's measure of directed divergence :

    $D(P:Q) = \sum_{i=1}^{n} p_i \ln \frac{p_i}{q_i}$.    (5.2)

This distance is not symmetric with respect to P and Q
and it does not satisfy the triangle inequality. In spite of these
two weaknesses, the geometry of the probability space characterised
by the distance function (5.2) is both interesting and useful.

The distance D(P:Q) represents the directed divergence of P from Q
and plays an important role in information theory and its applications
in various fields [60,32], specially through its role in
Kullback's [3] principle of minimum discrimination information.
D(P:Q) is a convex function of both P and Q
and is thus very suitable for use in optimization problems.
The special functional form (5.2) is particularly useful since
it involves the logarithmic function which is real only when
the variable is greater than zero. This ensures that the
answers we shall get to our optimization problems will be
positive and will usually belong to the probability space.

In this chapter we consider two optimization problems.
First we use (5.2) as the distance function. Later we shall use
other measures of directed divergence instead of (5.2) and
solve the problems. But before we introduce the problems, we
consider, in the next section, some results about the geometry
of the probability space.

5.1 Geometry of the Probability Space

There are n special points in the probability space, viz.

    $I_1 = (1, 0, 0, \ldots, 0)$
    $I_2 = (0, 1, 0, \ldots, 0)$
    $\vdots$                                    (5.3)
    $I_n = (0, 0, \ldots, 0, 1)$

corresponding to n degenerate distributions. Each of these
distributions represents a state of certainty. Also

    $D(I_k : P) = -\ln p_k, \quad k = 1, 2, \ldots, n$    (5.4)

represents the distance of the k-th state of certainty from the
probability distribution P. It depends only on $p_k$ and in some
sense it gives the measure of uncertainty associated with the
k-th outcome of the distribution P. The average of all such
uncertainties is

    $H(P) = -\sum_{i=1}^{n} p_i \ln p_i$    (5.5)

which gives a measure of uncertainty associated with the prob.
distn. P. This is defined as the entropy of P.

Uncertainty is minimum when P coincides with any of the
points (5.3) and is maximum when $p_1 = p_2 = \cdots = p_n = 1/n$. The
maximum value is equal to $\ln n$.
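
A small numerical illustration of (5.4) and (5.5) follows. It is only a sketch: the distribution `P` and the helper names `kl` and `degenerate` are assumptions chosen for the example, and the convention $0 \ln 0 = 0$ is applied.

```python
import math

def kl(p, q):
    """Directed divergence D(P:Q) = sum p_i ln(p_i / q_i), with 0 ln 0 taken as 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def degenerate(n, k):
    """The state of certainty I_k (k counted from 0 here)."""
    return [1.0 if i == k else 0.0 for i in range(n)]

P = [0.5, 0.3, 0.2]
n = len(P)

# Distance of each state of certainty from P; (5.4) says this is -ln p_k.
to_certainty = [kl(degenerate(n, k), P) for k in range(n)]

# Averaging these distances with weights p_k gives the entropy (5.5).
entropy = sum(pk * dk for pk, dk in zip(P, to_certainty))

uniform = [1.0 / n] * n
max_entropy = sum(u * kl(degenerate(n, k), uniform) for k, u in enumerate(uniform))
print(to_certainty, entropy, max_entropy)
```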


Fig. 5.1 gives the geometry of the probability space for
n = 6 where the numbers within squares give entropies and
numbers along the directed lines give directed divergences.



The minimum distance between two points P and Q is zero
and arises when P = Q. For finding the maximum distance we keep
Q fixed and note that $\sum_{i=1}^{n} p_i \ln (p_i/q_i)$ is a convex function of
$p_1, p_2, \ldots, p_n$ whose maximum value can occur at one of the n points
$I_1, I_2, \ldots, I_n$. The values at these points are $-\ln q_1, \ldots, -\ln q_n$
so that the maximum value of D(P:Q) is $-\ln q_{\min}$ where $q_{\min}$ is
the minimum of $q_1, q_2, \ldots, q_n$. As $q_{\min} \to 0$, this distance
approaches infinity and as such the distance between two points
in our geometry can be arbitrarily large.

If P and Q are two distributions then the set of distns.
$\lambda Q + (1-\lambda) P$, $0 \le \lambda \le 1$, will be said to constitute a line
segment. If $\lambda > 1$ or $\lambda < 0$ this may not be a prob. distn. and
as such this does not give a straight line. This is unlike the
Euclidean case.

The set of prob. distns. P satisfying

    $\sum_{i=1}^{n} p_i \ln \frac{p_i}{q_i} = K$    (5.6)

where $Q = (q_1, \ldots, q_n)$ is a fixed prob. distn. with each $q_i > 0$
gives a 'hypersphere' in the probability space, with 'centre'
Q and 'radius' K. However this defines a hypersphere only when
$K < -\ln q_{\min}$. If $K = -\ln q_{\min}$ there is a single point and if
$K > -\ln q_{\min}$ the hypersphere is imaginary.

Similarly

    $\sum_{i=1}^{n} p_i \ln \frac{p_i}{q_i} + \sum_{i=1}^{n} p_i \ln \frac{p_i}{r_i} = K$    (5.7)

where $Q = (q_1, \ldots, q_n)$ with $q_i > 0\ \forall\ i = 1, \ldots, n$ and
$R = (r_1, \ldots, r_n)$, $r_i > 0\ \forall\ i = 1, \ldots, n$, are any fixed distributions,
defines a hyper-ellipsoid with foci at Q and R. This will
not exist for all values of K. We investigate in section 5.4
the values of K for which prob. distns. satisfying (5.7) exist.
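
The bound on the 'radius' of a hypersphere can be checked by sampling. The sketch below is illustrative only: `Q`, the sampling scheme and the helper names are assumptions, and random sampling merely fails to find a counterexample rather than proving the bound.

```python
import math
import random

def kl(p, q):
    """Directed divergence D(P:Q), with 0 ln 0 taken as 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def random_point(rng, n):
    """A random probability distribution obtained by normalising uniform weights."""
    w = [rng.random() for _ in range(n)]
    s = sum(w)
    return [x / s for x in w]

Q = [0.6, 0.3, 0.1]
q_min = min(Q)
bound = -math.log(q_min)               # claimed maximum of D(P:Q)

k = Q.index(q_min)
corner = [1.0 if i == k else 0.0 for i in range(len(Q))]

rng = random.Random(0)
samples = [kl(random_point(rng, len(Q)), Q) for _ in range(2000)]
print(bound, kl(corner, Q), max(samples))
```

In this picture a 'radius' K larger than `bound` cannot be reached by any point of the probability space.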

5.2 First Optimization Problem

This problem is concerned with finding the maximum and
minimum values of

    $\theta = D(P:R) - D(P:Q)$    (5.8)

where Q and R are fixed distributions for which each component
is greater than zero.

The motivation for this problem arises from a problem
solved by Kullback [3] who found the distn. P which minimized
D(P:R) subject to $D(P:R) - D(P:Q) = \theta$ where $\theta$ is a fixed constant.
He discussed the solution for © lying between 0 and but his 
solution is valid for a larger range of values of 0» We want 
to find this range precisely 
Since $\sum_{i=1}^{n} q_i = 1$, $\sum_{i=1}^{n} r_i = 1$ and $Q \ne R$ there are one or
more $q_i$'s which are greater than the corresponding $r_i$'s and so
$(q_i/r_i)_{\max} > 1$ and $\ln (q_i/r_i)_{\max} > 0$. Moreover,

    $\ln \left(\frac{q_i}{r_i}\right)_{\max} = \sum_{j=1}^{n} q_j \ln \left(\frac{q_i}{r_i}\right)_{\max} \ge \sum_{j=1}^{n} q_j \ln \frac{q_j}{r_j} = D(Q:R)$    (5.12)

Thus we have

    $[D(P:R) - D(P:Q)]_{\max} \ge D(Q:R)$.    (5.13)

( 5*13) is a weaker form of triangle inequality according to 
which in Euclidean geometry 


D(P:R) - D(P:Q) > D(Q:R). 

(5.14) 

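Now we minimize D(P:R) - D(P:Q) subject to $\sum_{i=1}^{n} p_i = 1$. We write

    $D(P:R) - D(P:Q) = -\sum_{i=1}^{n} p_i \ln \frac{r_i}{q_i}$    (5.15)

    $[D(P:R) - D(P:Q)]_{\min} = -\ln \left(\frac{r_i}{q_i}\right)_{\max}$    (5.16)

Again

    $-\ln \left(\frac{r_i}{q_i}\right)_{\max} = -\sum_{j=1}^{n} r_j \ln \left(\frac{r_i}{q_i}\right)_{\max} \le -\sum_{j=1}^{n} r_j \ln \frac{r_j}{q_j} = -D(R:Q)$    (5.17)

Thus

    $[D(P:R) - D(P:Q)]_{\min} \le -D(R:Q)$.    (5.18)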


In fact inequality (5.18) can be deduced from (5.13).
For, by interchanging Q and R in (5.13), we get

    $[D(P:Q) - D(P:R)]_{\max} \ge D(R:Q)$

or  $-[D(P:Q) - D(P:R)]_{\max} \le -D(R:Q)$

or  $[D(P:R) - D(P:Q)]_{\min} \le -D(R:Q)$

which is the same as (5.18).

Thus the max. value of $\theta$ is $\ln (q_i/r_i)_{\max}$ which is $\ge D(Q:R)$
and the minimum value is $\ln (q_i/r_i)_{\min}$ which is $\le -D(R:Q)$. The
maximum value arises for that degenerate distribution P for
which that component is 1 for which $(q_i/r_i)$ is maximum and the
minimum value arises for that degenerate distribution for which
that component is unity for which $(q_i/r_i)$ is minimum.

Thus Kullback's problem can be solved when $\theta$ lies between
$\ln (q_i/r_i)_{\min}$ and $\ln (q_i/r_i)_{\max}$.

An alternative proof of the above result can be obtained
by attempting to solve Kullback's problem. We do it in the next
section.
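
The extreme values of $\theta$ stated above can be confirmed on a small example. The following is a sketch only; `Q`, `R` and the helper names are assumptions made for the illustration, and the random sampling is a sanity check, not a proof.

```python
import math
import random

def kl(p, q):
    """Directed divergence D(P:Q), with 0 ln 0 taken as 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def theta(p, q, r):
    """theta = D(P:R) - D(P:Q), which equals sum p_i ln(q_i / r_i)."""
    return kl(p, r) - kl(p, q)

Q = [0.5, 0.3, 0.2]
R = [0.2, 0.3, 0.5]
ratios = [qi / ri for qi, ri in zip(Q, R)]

theta_max = math.log(max(ratios))
theta_min = math.log(min(ratios))

n = len(Q)
corner_max = [1.0 if i == ratios.index(max(ratios)) else 0.0 for i in range(n)]
corner_min = [1.0 if i == ratios.index(min(ratios)) else 0.0 for i in range(n)]

rng = random.Random(1)
samples = []
for _ in range(2000):
    w = [rng.random() for _ in range(n)]
    s = sum(w)
    samples.append(theta([x / s for x in w], Q, R))

print(theta_min, -kl(R, Q), kl(Q, R), theta_max)
```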


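5.3 An Alternative Proof of the Results Regarding the Maximum
and Minimum Values of $\theta$

We minimize

    $\sum_{i=1}^{n} p_i \ln \frac{p_i}{r_i}$    (5.20)

subject to

    $\sum_{i=1}^{n} p_i \ln \frac{p_i}{r_i} - \sum_{i=1}^{n} p_i \ln \frac{p_i}{q_i} = \theta$    (5.21)

and $\sum_{i=1}^{n} p_i = 1$ to get

    $p_i = \frac{q_i^{\alpha} r_i^{1-\alpha}}{\sum_{i=1}^{n} q_i^{\alpha} r_i^{1-\alpha}}$    (5.22)

where $\alpha$ is determined from the equation

    $\frac{\sum_{i=1}^{n} q_i^{\alpha} r_i^{1-\alpha} \ln (q_i/r_i)}{\sum_{i=1}^{n} q_i^{\alpha} r_i^{1-\alpha}} = \theta$.    (5.23)

Let

    $f(\alpha) = \sum_{i=1}^{n} q_i^{\alpha} r_i^{1-\alpha}$.    (5.24)

Then we have

    $f'(\alpha) = \sum_{i=1}^{n} q_i^{\alpha} r_i^{1-\alpha} \ln \frac{q_i}{r_i}$    (5.25)

    $f''(\alpha) = \sum_{i=1}^{n} q_i^{\alpha} r_i^{1-\alpha} \left( \ln \frac{q_i}{r_i} \right)^2 > 0$    (5.26)

    $f(0) = 1, \quad f(1) = 1$    (5.27)

    $f'(0) = \sum_{i=1}^{n} r_i \ln \frac{q_i}{r_i} = -D(R:Q) < 0$    (5.28)

    $f'(1) = \sum_{i=1}^{n} q_i \ln \frac{q_i}{r_i} = D(Q:R) > 0$    (5.29)

and $f(\alpha)$ is a convex function with graph given in Fig. 5.2 and
$\alpha$ is determined from

    $f'(\alpha)/f(\alpha) = \theta$.    (5.30)

Also $f(\alpha)$ has a minimum at $\alpha = \alpha_0$ where $0 < \alpha_0 < 1$.

[Fig. 5.2: graph of the convex function $f(\alpha)$]

    $\frac{d\theta}{d\alpha} = \frac{f(\alpha) f''(\alpha) - f'^2(\alpha)}{f^2(\alpha)} = \frac{\left(\sum q_i^{\alpha} r_i^{1-\alpha}\right)\left(\sum q_i^{\alpha} r_i^{1-\alpha} (\ln \frac{q_i}{r_i})^2\right) - \left(\sum q_i^{\alpha} r_i^{1-\alpha} \ln \frac{q_i}{r_i}\right)^2}{\left(\sum q_i^{\alpha} r_i^{1-\alpha}\right)^2} \ge 0$    (5.31)

Here we have used Cauchy's inequality. Therefore $\theta$ increases
with $\alpha$ and

    $\theta_{\max} = \lim_{\alpha \to \infty} \frac{f'(\alpha)}{f(\alpha)}, \quad \theta_{\min} = \lim_{\alpha \to -\infty} \frac{f'(\alpha)}{f(\alpha)}$    (5.32)

so that

    $\theta_{\max} = \lim_{\alpha \to \infty} \frac{\sum q_i^{\alpha} r_i^{1-\alpha} \ln (q_i/r_i)}{\sum q_i^{\alpha} r_i^{1-\alpha}} = \ln \left(\frac{q_i}{r_i}\right)_{\max}$    (5.33)

and

    $\theta_{\min} = \lim_{\alpha \to -\infty} \frac{\sum q_i^{\alpha} r_i^{1-\alpha} \ln (q_i/r_i)}{\sum q_i^{\alpha} r_i^{1-\alpha}} = \ln \left(\frac{q_i}{r_i}\right)_{\min}$.    (5.34)

Also

    $(\theta)_{\alpha=1} = \sum_{i=1}^{n} q_i \ln \frac{q_i}{r_i} = D(Q:R)$    (5.35)

    $(\theta)_{\alpha=0} = \sum_{i=1}^{n} r_i \ln \frac{q_i}{r_i} = -D(R:Q)$    (5.36)

Since $\theta$ is a monotonic increasing function of $\alpha$,
$\lim_{\alpha \to \infty} \theta \ge f'(1) = D(Q:R)$ and $\lim_{\alpha \to -\infty} \theta \le f'(0) = -D(R:Q)$,
and so all the results of the last section follow.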
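
The behaviour of $\theta$ along the family (5.22) can be traced numerically. This is a minimal sketch under assumed example distributions `Q` and `R`; the function names are hypothetical. It evaluates $f'(\alpha)/f(\alpha)$ on a grid and at large $|\alpha|$.

```python
import math

def kl(p, q):
    """Directed divergence D(P:Q), with 0 ln 0 taken as 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

Q = [0.5, 0.3, 0.2]
R = [0.2, 0.3, 0.5]

def tilted(alpha):
    """The distribution (5.22): p_i proportional to q_i^alpha * r_i^(1 - alpha)."""
    w = [qi ** alpha * ri ** (1.0 - alpha) for qi, ri in zip(Q, R)]
    f = sum(w)
    return [x / f for x in w]

def theta_of_alpha(alpha):
    """f'(alpha) / f(alpha), i.e. the value of theta attained by tilted(alpha)."""
    p = tilted(alpha)
    return sum(pi * math.log(qi / ri) for pi, qi, ri in zip(p, Q, R))

grid = [-5.0 + 0.25 * k for k in range(41)]      # alpha from -5 to 5
values = [theta_of_alpha(a) for a in grid]

ratios = [qi / ri for qi, ri in zip(Q, R)]
print(theta_of_alpha(0.0), theta_of_alpha(1.0))
print(theta_of_alpha(-60.0), theta_of_alpha(60.0))
```

The values at $\alpha = 0$ and $\alpha = 1$ reproduce (5.36) and (5.35), and the values at $\alpha = \pm 60$ are already numerically indistinguishable from the limits (5.34) and (5.33) for this example.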

5.4 Second Optimization Problem

This problem is concerned with finding the maximum and
minimum values of

    $\varphi = \lambda_1 D(P:Q_1) + \lambda_2 D(P:Q_2) + \cdots + \lambda_m D(P:Q_m)$    (5.37)

where

    $\lambda_j \ge 0\ \forall\ j, \quad \sum_{j=1}^{m} \lambda_j = 1$    (5.38)

and

    $Q_j = (q_{j1}, q_{j2}, \ldots, q_{jn}), \quad j = 1, \ldots, m$    (5.39)

are given probability distributions with each $q_{ji} > 0$. Thus
this problem is concerned with finding the maximum and minimum
values of the weighted sum of directed divergences of P from
$Q_1, Q_2, \ldots, Q_m$.

Now

    $\varphi = \sum_{j=1}^{m} \lambda_j \sum_{i=1}^{n} p_i \ln \frac{p_i}{q_{ji}} = \sum_{i=1}^{n} p_i \ln p_i - \sum_{j=1}^{m} \lambda_j \sum_{i=1}^{n} p_i \ln q_{ji} = \sum_{i=1}^{n} p_i \ln \frac{p_i}{\bar{q}_i / \beta} - \ln \beta$    (5.40)

where

    $\bar{q}_i = \prod_{j=1}^{m} q_{ji}^{\lambda_j}$    (5.41)

is the weighted geometric mean of the i-th components of the prob.
distns. $Q_1, Q_2, \ldots, Q_m$ and

    $\beta = \sum_{i=1}^{n} \bar{q}_i$    (5.42)

so that $(\bar{q}_1/\beta, \bar{q}_2/\beta, \ldots, \bar{q}_n/\beta)$ is a probability distn. The minimum
and maximum values of $\sum_{i=1}^{n} p_i \ln \frac{p_i}{\bar{q}_i/\beta}$ are zero and $\ln (\beta / \bar{q}_{\min})$.

Thus the minimum value of $\varphi$ is $-\ln \beta$ and it occurs when
$p_i = \bar{q}_i / \beta$, and the maximum value of $\varphi$ is $-\ln \bar{q}_{\min}$ and it
occurs when P is the degenerate distn. whose only non-zero
component is that for which $\bar{q}_i$ is minimum. Now

    $\beta = \sum_{i=1}^{n} \bar{q}_i = \sum_{i=1}^{n} \prod_{j=1}^{m} q_{ji}^{\lambda_j} \le \sum_{i=1}^{n} \left( \sum_{j=1}^{m} \lambda_j q_{ji} \right) = 1 \ \Rightarrow\ -\ln \beta \ge 0 \quad [\text{GM} \le \text{AM}]$    (5.43)

so that, as expected, both the maximum and the minimum values
of $\varphi$ are positive.
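
The solution of the second problem can be verified on a small example. The sketch below is illustrative: the three distributions `Qs`, the weights `lam` and the helper names are assumptions, and the random sampling only checks that no sampled point violates the stated bounds.

```python
import math
import random

def kl(p, q):
    """Directed divergence D(P:Q), with 0 ln 0 taken as 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

Qs = [[0.5, 0.3, 0.2],
      [0.2, 0.3, 0.5],
      [0.1, 0.6, 0.3]]
lam = [0.5, 0.3, 0.2]            # weights lambda_j, summing to 1
n = len(Qs[0])

def phi(p):
    """Weighted sum of directed divergences of P from Q_1, ..., Q_m, as in (5.37)."""
    return sum(l * kl(p, q) for l, q in zip(lam, Qs))

# Weighted geometric means (5.41) and their sum beta (5.42).
q_bar = []
for i in range(n):
    g = 1.0
    for l, q in zip(lam, Qs):
        g *= q[i] ** l
    q_bar.append(g)
beta = sum(q_bar)

p_star = [g / beta for g in q_bar]                     # claimed minimiser
k = q_bar.index(min(q_bar))
corner = [1.0 if i == k else 0.0 for i in range(n)]    # claimed maximiser

rng = random.Random(2)
samples = []
for _ in range(2000):
    w = [rng.random() for _ in range(n)]
    s = sum(w)
    samples.append(phi([x / s for x in w]))

print(beta, phi(p_star), phi(corner))
```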

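Now we shall use other measures of directed divergence
as distance functions in solving these two optimization problems.

5.5 First Problem Using Generalized Measures of Directed
Divergence

5.5.1 Havrda-Charvat's Measure of Directed Divergence

    $D^{\alpha}(P:Q) = \frac{1}{\alpha-1} \left[ \sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha} - 1 \right], \quad \alpha \ne 1$    (5.44)

is the new distance measure in the probability space. We shall
maximize and minimize

    $\theta_\alpha = D^{\alpha}(P:R) - D^{\alpha}(P:Q) = \frac{1}{\alpha-1} \sum_{i=1}^{n} p_i^{\alpha} \left( r_i^{1-\alpha} - q_i^{1-\alpha} \right)$    (5.45)

subject to $\sum_{i=1}^{n} p_i = 1$.

Consider $(r_i^{1-\alpha} - q_i^{1-\alpha})$, $i = 1, 2, \ldots, n$. Let i = k be the
index for which $(r_i^{1-\alpha} - q_i^{1-\alpha})$ attains the maximum value, viz.

    $(r_k^{1-\alpha} - q_k^{1-\alpha}) = \max_i (r_i^{1-\alpha} - q_i^{1-\alpha})$.    (5.46)

Then we have

    $(r_k^{1-\alpha} - q_k^{1-\alpha}) \ge (r_i^{1-\alpha} - q_i^{1-\alpha}) \quad \forall\ i = 1, \ldots, n$.    (5.47)

Case (i) : Now let $\alpha > 1$, then from (5.47) we get

    $\frac{1}{\alpha-1} \sum_{i=1}^{n} p_i^{\alpha} (r_i^{1-\alpha} - q_i^{1-\alpha}) \le \frac{1}{\alpha-1} (r_k^{1-\alpha} - q_k^{1-\alpha}) \sum_{i=1}^{n} p_i^{\alpha}$.    (5.48)

But for $\alpha > 1$, $\sum_{i=1}^{n} p_i^{\alpha} \le 1$, so that

    $\frac{r_k^{1-\alpha} - q_k^{1-\alpha}}{\alpha-1} \ge \frac{1}{\alpha-1} (r_k^{1-\alpha} - q_k^{1-\alpha}) \sum_{i=1}^{n} p_i^{\alpha}$.    (5.49)

Now from (5.48) and (5.49) we get for $\alpha > 1$

    $\max(\theta_\alpha) = \frac{r_k^{1-\alpha} - q_k^{1-\alpha}}{\alpha-1}$ and it occurs for $P = I_k$.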
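
Case (i) can be checked numerically for a particular order. The sketch below assumes $\alpha = 2$, the form (5.44) of the directed divergence, and example distributions `Q` and `R`; all names are hypothetical and the sampling is a sanity check only.

```python
import random

def hc_divergence(p, q, alpha):
    """Havrda-Charvat directed divergence in the form (5.44), for alpha != 1."""
    s = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q))
    return (s - 1.0) / (alpha - 1.0)

def theta_hc(p, q, r, alpha):
    """theta_alpha = D^alpha(P:R) - D^alpha(P:Q), as in (5.45)."""
    return hc_divergence(p, r, alpha) - hc_divergence(p, q, alpha)

Q = [0.5, 0.3, 0.2]
R = [0.2, 0.3, 0.5]
alpha = 2.0
n = len(Q)

# Coefficients r_i^(1-alpha) - q_i^(1-alpha); index k maximises them, as in (5.46).
coeff = [ri ** (1.0 - alpha) - qi ** (1.0 - alpha) for qi, ri in zip(Q, R)]
k = coeff.index(max(coeff))
claimed_max = coeff[k] / (alpha - 1.0)
corner = [1.0 if i == k else 0.0 for i in range(n)]

rng = random.Random(3)
samples = []
for _ in range(2000):
    w = [rng.random() for _ in range(n)]
    s = sum(w)
    samples.append(theta_hc([x / s for x in w], Q, R, alpha))

print(claimed_max, theta_hc(corner, Q, R, alpha), max(samples))
```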

Case (ii) : Now let 0 < a < 1* and let i = s be the value of the 
index for which (r^"^-q^’^) is minimum# That is we have 
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^ ^ ^ ^ 


n 


n 




(5*50) 


n 


Now because 0 < a < 1 , we have (a-l) < 0 and S p? > 1* From 

i=l ^ ” 

( 5 • 50 ) we get 


n 


A _ Cl / „l—oc i-a% ^ X „ a/ x—a i— a\ t 

a-r ^ - a-l .J. ^ 


n 


.^s :“Js. - . a “^s ^ 


( 5.51) 


(5.52) 


From (5.51) and (5.52) we get for 0 < a < 1(6^ max^ ~ 
_i “Ct«ql “Ot 


— OG^^l occurs for P = 1^. 


Now we shall find the minimum value of 6^. Consider the 
following two cases : 

j-l-OC 1-a j-l“a„q.i"CC 

Case (iii) : Let a > 1 and min( ^ ' ) = ( — ) < q . 


a-l 


a-l 


H 


(5.53) 

n n 

ere we note that S rj = S q.. = 1 and therefore some q.'s are 

i=l i=l ^ ^ 

greater than the corresponding r^^'s and the rest of <j£*s are less 

than the corresponding r^'s. Therefore ( ) is non- 

positive. Therefore we have from (5.53) 
n ^ r^-oc 1-a 1-a 1-a. 

£ Pi > ( E (^a-i) - (a-l) 


(5.54) 
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The minimum value of 6^ is C — ) and it occurs P = I . 


a-1 


m 


„l-a 


i-a ^ 

Case (Iv) i Let 0 < a < i and max ~ — ) = — — > 0« 

“ ,• a-1 (a-i) “ 


(5,55) 


Therefore 

— (X 1 “<x 




S pH 

i=l 


a- 


n T'^“®_rr^“CC i-a i-a 

_ia ^ . / -r „a^/-^M % - \ , 1-a “M *“% 

i ' ^ i=/i ^ - " - JS-i ' ! 


(5.56) 


n 


because for 0 < a < 1, S is convex and hence P = (— ,~, , 9 *..— > 

a 

minimizes S Pv • But this value is never attained by 6 . So 
i:=l 1 O 

we have been able to obtain only a lower bound for the minimum 
of 6 q when 0 < a < i , We shall now summarise our results of 
subsection 5.5»1. 


n 


Max(eQ) : (a > 1) 


= A Cr|-^-qf ^;) 


(0 < a < l)= 2 ^ min ( 

iMX 1-a 
r . — q. 


i-a i-a, 


rT -qr 

Min(e ) : (a > 1) =s min — ) 

o j a— i 


.-a 


(0< a < 1).> n^"^ max(-i~-~— .) • 


•i 




5 .5 *2 Kapur* s Measure of Directed Divergence 


n 


p. . n (1+ap. ) 

'^0. , 1 n 1 


Here D(P:Q) = -D^(P':Q) = p-j_ in ^ (H-ap^.)ln ‘([Y^T^'qTT 

i— 1 X 


(5.57) 


n q. i+aq. . n I’taq. 

and we have 6,^ = ?T 


(5.58)

For our purposes, the second term in (5.58) is a constant.
Therefore we rewrite


n 


% 


®k = S Pi fin ^ - In 


i=l 


itaq^ 

i+ar. 


j - A 


(5.59) 


We can straight away conclude that 

giCl+ar,) 

max(ej^) = max fin ) } - A ( 5 . 60 ) 

and 

q^d+ari) 

min (e^)= min {In - A. 

We know that as $a \to 0$, (5.57) approaches Kullback-Leibler's measure
of directed divergence, (5.2). Now we shall show that as $a \to 0$,
our results here approach our results obtained in section 5.2.

* n i+ar. 

It A = It I S In (t-tt—) 

a-*0 a^O ^ i=l x+aqi 


n itaq. (l+arj)q. - (l-faq. )r. 
S (r-rrr4) = — = — -1 = — - 


S ^-1 
a-0i=i ^ i 


(Itaq^) 


n 


S (qi*-r.) = 0- 
i-1 


It max(6T^) = max In (——) and It min(6i,) = min In (-2i) which 
a-*0 1 i a-*0 i i 

are same as our results obtained in 5.2. 


5.6 Second Problem Using Generalized Measures of Directed
Divergence

5.6.1 Havrda-Charvat's Measure of Directed Divergence :

Here we are concerned with finding the maximum and
minimum of


<P_= S X. d'^CPiQ.) 
o J J 


where Q. = (q../*»-/g. ), j = We have 


ak ^ ^ 4 

^ ^ j=l ^ 1=1 ^ 

, n m 

^ _i: ( E X. q^.^-1) 

“ 1=1 ^ j=l J 


( E 5.)^*^ ^ 

( 2 ( % q-Kx . 

a-l 1=1 

E q. 


(5-61) 


where 


ii = ( E Prom (5.6i) we find that 

^ T=1 -J 3^ 


a constant times multiplied Havrda-Charvats' directed divergence 

- =- - - -. ^- 
of P from 15 = ( where ^ = (^/ q^^- 

1 

Therefore we can easily see that the minimum value of is zero 
and the maximum is obtained as follows : 

Case (i) : Let a > 1# Then we have, because x -» x^"^ is a 


decreasing function, if q^ = min q^ 

(5p^-“ >5- V 1 = 


(5-62) 


-> C s Pi)qi.““ > s p't q: 

i=l i=i 


a l-a 


=> (qi”*^) > ( E P^)(q^“^) ^ _S P^qJ"^ because (E p^) < 1 
i=l i=l 


OC \ / =1 ^CL \ 


a=i-a 


=> 


CL =x —a ‘ - 
^ ^ ^ Pi ^i " 

i=i ^ ^ 


(5#63) 
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Now because (a-1) is greater than zero we have from (5*63) that 


0=1 " ^ fa=TJ 


(5*64) 


There from (5»61) and(5«64) we can easily observe that the max* 


value of (p^ is given by 


n 


i=l ^ 


a-1 


(5*65) 


and it occurs for P = Ij,. 

(ii) ; O < a < 1* Here we have (l-a) > 0 and therefore 

^ . 

X “*■ X IS an increasing function# Therefore we have, instead 
of (5.63) 


- 1 < ,S P^Cq^)^"® - i. 

1—1 

But now (a-1) < 0. Therefore from (5.66) we get 


(5.66) 


(5»67) 


Now from (5.61) and(5.67) we again gat the max. of as given 
in (5.65). 

/ y — \ 1 — a 

^ ^ r = 1“<X 1 

Therefore for all a, max = — - 1 J* 

We can easily see that our results of this subsection
coincide with those in section 5.3 when $\alpha$ is made to approach 1.

5.6.2 Ferreri's Measure of Directed Divergence
t ^ i^Pi , 

'rTTii7> > ° 
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Here (p 


_ 1 


m 


n 


1 + jup. 

M “ M jEj +MPi) In 


•ji 


1 n 1 ^ ni X" 

p- ^S^(l+WPi)ln(l+yp£)- - ^S^(l+jLipj_)ln( rr^ (i+jiiqj^) 


? • 

(5.63) 


m X. 

TT (i + juqj^) ^ 

Now we denote by ^ ^ then fq^} denotes a 

S TT (1 + Mq.j ) 

i=l 3=1 J 

probability distribution. Then from (5,53) we get 


„4.,^ ^ H*iup-. (l+jup,- )/(n+jLj) 

= —T C 


n+ju 


n m Xj 

E TT(i+jLtq. . ) 
i=i j=l 


] 


(5*69) 

Now we see from (5*69) that the second term of is a constant 

n „ ^ p4 „ 

and the first term can be represented as ( S p. In ^)(2I^, The 

1=1 q^ ^ 

minimum and maximum values of the first term of (5*69) arezero 

and In (~ ^ '■■ "• ') respectively* From this we deduce the minimum and 
'^in 


maximum values of <p to be i 


- (2^ In (-- — — ) and 

S T1 (1 + jnq..) 

1*1 j=i 


(2±ii) In (-^— ) - 




'^min 


In (-- 


n±iL 


ra 


■) 


E TT (1 t jLiq.. ) ^ 
i=l j=l 


respectively.

5.7 Geometric Interpretation

In Euclidean space the locus of a point P which moves
so that PA - PB = C, where A and B are fixed points and C is a
constant, is a hyperbola provided C < AB. In the case of the
similar problem in probability space, $\theta$ can be greater than
D(Q:R).

Again in Euclidean space the locus of a point, the sum
of weighted distances from which to m given points is constant,
is an m-ellipse which is a closed convex hyper-surface and this
exists only if the constant is greater than a certain critical
value, but there is no upper limit to the value of this constant.

On the other hand in the corresponding problem in probability
space, the corresponding constant has both upper and lower bounds.
The Steiner problem in Euclidean space has in general no easy
solution, while the corresponding Steiner problem in probability
space has an elegant solution.

One operational significance of the second problem can be
as follows : m individuals whose relative importances are given
by $\lambda_1, \lambda_2, \ldots, \lambda_m$ have given their assessed proportions of
allotments to different aspects of some work in the form of
$Q_1, Q_2, \ldots, Q_m$; then we can find an allotment which is closest
to these allotments.

With that we conclude this chapter. In the next chapter
we shall consider some more optimization problems in information
theory.

Chapter 6

Some More Optimization Problems

Introduction. Let $Q = (q_1, q_2, \ldots, q_n)$ and $R = (r_1, r_2, \ldots, r_n)$
be any two given probability distributions with $q_i > 0$,
$r_i > 0\ \forall\ i = 1, 2, \ldots, n$ and $\sum_{i=1}^{n} q_i = \sum_{i=1}^{n} r_i = 1$.

Let D(Q:R) be Kullback-Leibler's measure of
directed divergence. Then Kapur [35] considered the
following optimization problems:

(1) Out of all probability distributions which are equally
distant from Q and R find that distribution which is
nearest to Q (or R).

(2) Out of all distributions which are at a distance a from
R, find that distribution which is closest to Q.

(3) Out of all distributions whose distances from Q and R
are in the ratio 1 : k find that distribution which is
closest to Q.

(4) Out of all distributions whose sum of distances from
Q and R is b, find that distribution which is closest
to Q.

(5) Find the distribution P from which the sum of directed
divergences to k given distributions $Q_1, Q_2, \ldots, Q_k$ is
minimum.
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(6) Out of all distributions from which the sm of weighted 
directed divergences to Qj, / Q 2 / • . , is constant, find 
the distribution closest to Q^, 

(7) Out of all distributions which are equally close to Q 
and Rf find the distribution •which is closest to another 
distribution S, 

Here the distance from P to Q means, as in the previous chapter, the Kullback-Leibler directed divergence of P from Q.

Kapur [35] has solved the seven optimization problems and has also given solutions to their equivalents with Euclidean distances, which are well known. In this chapter we deal with these problems using Havrda-Charvat's measure

$$D^{\alpha}(P:Q) = \frac{1}{\alpha-1}\left\{\sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha} - 1\right\}$$

as the distance from P to Q.

We obtain the solutions to these problems by making use of Lagrange's method of multipliers for minimizing the distances. We consider the special cases $\alpha = 1/2$ and $\alpha = 2/3$ to simplify the expressions. In some cases we present numerical solutions to these problems by assigning numerical values to the given distributions.
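The distance used throughout this chapter can be evaluated directly from its definition. The following is a minimal sketch, not part of the original text: the function names `hc_divergence` and `kl_divergence` are introduced here for illustration, and the sample distributions are the ones used in the numerical examples of this chapter.

```python
import math

def hc_divergence(p, q, alpha):
    """Havrda-Charvat directed divergence D^alpha(P:Q), alpha > 0, alpha != 1."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    total = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q))
    return (total - 1.0) / (alpha - 1.0)

def kl_divergence(p, q):
    """Kullback-Leibler directed divergence, the limit of the above as alpha -> 1."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

Q = (0.7, 0.3)
R = (0.4, 0.6)
d_half = hc_divergence(Q, R, 0.5)              # special case alpha = 1/2
d_two_thirds = hc_divergence(Q, R, 2.0 / 3.0)  # special case alpha = 2/3
```

For $\alpha = 1/2$ the measure reduces to $-2\{\sum\sqrt{p_i q_i} - 1\}$, which is why square roots appear in all the case (i) formulas below.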

6.1 Problem 1. Find a probability distribution P which is nearest to the distribution Q out of all distributions which are equidistant from the distributions Q and R.

Minimize

$$D^{\alpha}(P:Q) = \frac{1}{\alpha-1}\left\{\sum_{i=1}^{n} p_i^{\alpha} q_i^{1-\alpha} - 1\right\} \quad\text{subject to}\quad \sum_{i=1}^{n} p_i = 1, \qquad \sum_{i=1}^{n} p_i^{\alpha}\left(q_i^{1-\alpha} - r_i^{1-\alpha}\right) = 0.$$

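By applying Lagrange's multipliers method we get

$$p_i = \frac{\left\{q_i^{1-\alpha} - \lambda\left(q_i^{1-\alpha} - r_i^{1-\alpha}\right)\right\}^{1/(1-\alpha)}}{\sum_{i=1}^{n}\left\{q_i^{1-\alpha} - \lambda\left(q_i^{1-\alpha} - r_i^{1-\alpha}\right)\right\}^{1/(1-\alpha)}} \tag{6.1}$$

and $\lambda$ is to be eliminated using $\sum_{i=1}^{n} p_i^{\alpha}\left(q_i^{1-\alpha} - r_i^{1-\alpha}\right) = 0$, or

$$\sum_{i=1}^{n}\left\{q_i^{1-\alpha} - \lambda\left(q_i^{1-\alpha} - r_i^{1-\alpha}\right)\right\}^{\alpha/(1-\alpha)}\left(q_i^{1-\alpha} - r_i^{1-\alpha}\right) = 0. \tag{6.2}$$

Now we shall first consider two special cases, (i) $\alpha = 1/2$ and (ii) $\alpha = 2/3$.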

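Case (i) $\alpha = 1/2$.

Then (6.1) becomes

$$p_i = \frac{\left\{\sqrt{q_i} - \lambda\left(\sqrt{q_i} - \sqrt{r_i}\right)\right\}^2}{\sum_{i=1}^{n}\left\{\sqrt{q_i} - \lambda\left(\sqrt{q_i} - \sqrt{r_i}\right)\right\}^2}$$

and (6.2) becomes

$$\sum_{i=1}^{n}\left\{\sqrt{q_i} - \lambda\left(\sqrt{q_i} - \sqrt{r_i}\right)\right\}\left(\sqrt{q_i} - \sqrt{r_i}\right) = 0$$

or, since $\sum q_i = \sum r_i = 1$,

$$\Bigl(1 - \sum_{i=1}^{n}\sqrt{q_i r_i}\Bigr)(1 - 2\lambda) = 0, \quad\text{so that}\quad \lambda = \tfrac{1}{2} \ \text{ if } Q \neq R. \tag{6.3}$$

Therefore

$$p_i = \frac{\left(\sqrt{q_i} + \sqrt{r_i}\right)^2}{\sum_{i=1}^{n}\left(\sqrt{q_i} + \sqrt{r_i}\right)^2}. \tag{6.4}$$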
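As a numerical check of case (i), the sketch below builds P from the closed form with $\lambda = 1/2$, for which the bracket in (6.1) becomes $(\sqrt{q_i} + \sqrt{r_i})/2$, and confirms that P is equidistant from Q and R. The helper names `hc_half` and `nearest_equidistant_half` are assumptions made for this illustration only.

```python
import math

def hc_half(p, q):
    """Havrda-Charvat divergence for alpha = 1/2: -2 * (sum sqrt(p_i q_i) - 1)."""
    return -2.0 * (sum(math.sqrt(a * b) for a, b in zip(p, q)) - 1.0)

def nearest_equidistant_half(q, r):
    """P with p_i proportional to (sqrt(q_i) + sqrt(r_i))^2, i.e. lambda = 1/2."""
    weights = [(math.sqrt(a) + math.sqrt(b)) ** 2 for a, b in zip(q, r)]
    total = sum(weights)
    return [w / total for w in weights]

Q = (0.7, 0.3)
R = (0.4, 0.6)
P = nearest_equidistant_half(Q, R)
gap = hc_half(P, Q) - hc_half(P, R)  # should vanish up to rounding
```

The equidistance follows because $\sum(\sqrt{q_i}+\sqrt{r_i})(\sqrt{q_i}-\sqrt{r_i}) = \sum(q_i - r_i) = 0$.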
Case (ii) $\alpha = 2/3$.

In this case, writing $A_i = q_i^{1/3} - r_i^{1/3}$, (6.2) reduces to the following quadratic equation in $\lambda$:

$$\lambda^2\sum_{i=1}^{n} A_i^3 - 2\lambda\sum_{i=1}^{n} q_i^{1/3}A_i^2 + \sum_{i=1}^{n} q_i^{2/3}A_i = 0. \tag{6.5}$$

The solutions of equation (6.5) are given by

$$\lambda = \frac{\sum_{i=1}^{n} q_i^{1/3}A_i^2 \pm \sqrt{\Bigl(\sum_{i=1}^{n} q_i^{1/3}A_i^2\Bigr)^2 - \Bigl(\sum_{i=1}^{n} A_i^3\Bigr)\Bigl(\sum_{i=1}^{n} q_i^{2/3}A_i\Bigr)}}{\sum_{i=1}^{n} A_i^3}. \tag{6.6}$$


We shall evaluate this case for a set of distributions Q and R and find P.

Example 6.1.1. For $\alpha = 2/3$, if we take Q = (.7, .3) and R = (.4, .6), we get the two solutions of (6.6) as $\lambda_1 = 1.5881$ and $\lambda_2 = -0.6575$, and $P_1 = (.2433, .7567)$, $P_2 = (.3491, .1509)$. The corresponding distances are .29 and .42 respectively. So the optimal solution is given by $P_1$.
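6.2 Problem 2. Out of all distributions that are at a distance a from R, find that distribution which is closest to Q.

Minimize $\frac{1}{\alpha-1}\left\{\sum_{i=1}^{n} p_i^{\alpha}q_i^{1-\alpha} - 1\right\}$ subject to

$$\text{(i)}\ \sum_{i=1}^{n} p_i = 1, \qquad \text{(ii)}\ \frac{1}{\alpha-1}\left\{\sum_{i=1}^{n} p_i^{\alpha}r_i^{1-\alpha} - 1\right\} = a. \tag{6.7}$$

By using Lagrange's multipliers method we obtain the minimizing distribution $\{p_i\}$ as follows:

$$p_i = \frac{\left(q_i^{1-\alpha} - \lambda r_i^{1-\alpha}\right)^{1/(1-\alpha)}}{\sum_{i=1}^{n}\left(q_i^{1-\alpha} - \lambda r_i^{1-\alpha}\right)^{1/(1-\alpha)}} \tag{6.8}$$

where $\lambda$ is to be eliminated using (ii) above. The equation for obtaining $\lambda$ is as follows:

$$\frac{1}{\alpha-1}\left\{\frac{\sum_{i=1}^{n}\left(q_i^{1-\alpha} - \lambda r_i^{1-\alpha}\right)^{\alpha/(1-\alpha)} r_i^{1-\alpha}}{\Bigl[\sum_{i=1}^{n}\left(q_i^{1-\alpha} - \lambda r_i^{1-\alpha}\right)^{1/(1-\alpha)}\Bigr]^{\alpha}} - 1\right\} - a = 0. \tag{6.9}$$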

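Now we shall consider two special cases.

Case (i) $\alpha = 1/2$.

(6.8) becomes

$$p_i = \frac{\left(\sqrt{q_i} - \lambda\sqrt{r_i}\right)^2}{\sum_{i=1}^{n}\left(\sqrt{q_i} - \lambda\sqrt{r_i}\right)^2}$$

and (6.9) becomes

$$-2\left\{\frac{\sum_{i=1}^{n}\left(\sqrt{q_i} - \lambda\sqrt{r_i}\right)\sqrt{r_i}}{\Bigl[\sum_{i=1}^{n}\left(\sqrt{q_i} - \lambda\sqrt{r_i}\right)^2\Bigr]^{1/2}} - 1\right\} = a \quad\text{or}\quad 4\Bigl\{\sum_{i=1}^{n}\left(\sqrt{q_i} - \lambda\sqrt{r_i}\right)\sqrt{r_i}\Bigr\}^2 = (2-a)^2\sum_{i=1}^{n}\left(\sqrt{q_i} - \lambda\sqrt{r_i}\right)^2 \tag{6.10}$$

which, on writing $\sigma = \sum_{i=1}^{n}\sqrt{r_i q_i}$ and using $\sum q_i = \sum r_i = 1$, is equivalent to

$$\{4 - (2-a)^2\}\lambda^2 - 2\sigma\{4 - (2-a)^2\}\lambda + \{4\sigma^2 - (2-a)^2\} = 0. \tag{6.11}$$

So

$$\lambda = \sigma \pm (2-a)\sqrt{\frac{1-\sigma^2}{4-(2-a)^2}}.$$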
One value of $\lambda$ here gives the optimal distribution.

Example 6.2.1. Let Q = (.7, .3) and R = (.4, .6). We get the two solutions for (6.11) as $\lambda$ = .66, -2.57 and $P_1$ = (.994, .005), $P_2$ = (…, .303) respectively. The corresponding distances $D^{1/2}(P_1:Q)$ and $D^{1/2}(P_2:Q)$ are $0.927 \times 10^{-1}$ and $0.183 \times 10^{-4}$ respectively. So the optimal solution is given by $P_2$.
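Case (i) of Problem 2 can be explored numerically. The sketch below is an illustration under stated assumptions, not the author's computation: it rederives the multiplier for $\alpha = 1/2$ directly from (6.9), using the shorthand $\sigma = \sum\sqrt{r_i q_i}$ and $c = (2-a)^2$ (both introduced here), which gives $\lambda = \sigma - \sqrt{c(1-\sigma^2)/(4-c)}$ as the root that survives the squaring step. The function name and the sample value a = 0.05 are hypothetical, and the sketch assumes $0 < a < D^{1/2}(Q:R)$ so that all brackets $\sqrt{q_i} - \lambda\sqrt{r_i}$ stay nonnegative.

```python
import math

def hc_half(p, q):
    """Havrda-Charvat divergence for alpha = 1/2."""
    return -2.0 * (sum(math.sqrt(x * y) for x, y in zip(p, q)) - 1.0)

def closest_at_distance_half(q, r, a):
    """Sketch: P with D(P:R) = a that is closest to Q, for alpha = 1/2."""
    sigma = sum(math.sqrt(x * y) for x, y in zip(q, r))
    c = (2.0 - a) ** 2
    # Root of the squared equation that also satisfies the unsquared one.
    lam = sigma - math.sqrt(c * (1.0 - sigma ** 2) / (4.0 - c))
    brackets = [math.sqrt(x) - lam * math.sqrt(y) for x, y in zip(q, r)]
    if min(brackets) < 0.0:
        raise ValueError("a bracket is negative; the closed form does not apply")
    total = sum(v * v for v in brackets)
    return lam, [v * v / total for v in brackets]

Q = (0.7, 0.3)
R = (0.4, 0.6)
lam, P = closest_at_distance_half(Q, R, 0.05)
```

The returned P can be checked against the constraint by evaluating `hc_half(P, R)`, which should reproduce the prescribed distance a.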

Case (ii) $\alpha = 2/3$.

Then we have from (6.9)

$$-3\left\{\frac{\sum_{i=1}^{n}\left(q_i^{1/3} - \lambda r_i^{1/3}\right)^2 r_i^{1/3}}{\Bigl[\sum_{i=1}^{n}\left(q_i^{1/3} - \lambda r_i^{1/3}\right)^3\Bigr]^{2/3}} - 1\right\} = a$$

or

$$27\Bigl\{\sum_{i=1}^{n}\left(q_i^{1/3} - \lambda r_i^{1/3}\right)^2 r_i^{1/3}\Bigr\}^3 = (3-a)^3\Bigl\{\sum_{i=1}^{n}\left(q_i^{1/3} - \lambda r_i^{1/3}\right)^3\Bigr\}^2$$

or, with the notation that $\sum_{i=1}^{n} q_i^{2/3}r_i^{1/3} = A$ and $\sum_{i=1}^{n} q_i^{1/3}r_i^{2/3} = B$, we have

$$27\left(\lambda^2 - 2B\lambda + A\right)^3 = (3-a)^3\left(1 - 3A\lambda + 3B\lambda^2 - \lambda^3\right)^2. \tag{6.12}$$

(6.12) is a 6th degree equation in $\lambda$, one of whose roots gives the required distribution $\{p_i\}$ when substituted in (6.8).


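6.3 Problem 3. Out of all probability distributions whose distances from Q and R are in the ratio 1:k, find that distribution which is closest to Q.

Solution. Minimize

$$\frac{1}{\alpha-1}\left\{\sum_{i=1}^{n} p_i^{\alpha}q_i^{1-\alpha} - 1\right\}$$

subject to

$$\text{(i)}\ \sum_{i=1}^{n} p_i = 1 \quad\text{and}\quad \text{(ii)}\ k\left\{\sum_{i=1}^{n} p_i^{\alpha}q_i^{1-\alpha} - 1\right\} - \left\{\sum_{i=1}^{n} p_i^{\alpha}r_i^{1-\alpha} - 1\right\} = 0. \tag{6.13}$$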


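By making use of Lagrange's multipliers method we obtain the minimizing distribution as given below:

$$p_i = \frac{\left[q_i^{1-\alpha} - \lambda\left(kq_i^{1-\alpha} - r_i^{1-\alpha}\right)\right]^{1/(1-\alpha)}}{\sum_{i=1}^{n}\left[q_i^{1-\alpha} - \lambda\left(kq_i^{1-\alpha} - r_i^{1-\alpha}\right)\right]^{1/(1-\alpha)}}. \tag{6.14}$$

Now we eliminate $\lambda$ by substituting the above expression in (6.13), and obtain the equation for solving $\lambda$ as follows:

$$\frac{\sum_{i=1}^{n}\left\{q_i^{1-\alpha} - \lambda\left(kq_i^{1-\alpha} - r_i^{1-\alpha}\right)\right\}^{\alpha/(1-\alpha)}\left(kq_i^{1-\alpha} - r_i^{1-\alpha}\right)}{\Bigl[\sum_{i=1}^{n}\left\{q_i^{1-\alpha} - \lambda\left(kq_i^{1-\alpha} - r_i^{1-\alpha}\right)\right\}^{1/(1-\alpha)}\Bigr]^{\alpha}} + (1-k) = 0. \tag{6.15}$$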


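Case (i) $\alpha = 1/2$.

Then (6.15) becomes

$$\sum_{i=1}^{n}\left\{\sqrt{q_i} - \lambda\left(k\sqrt{q_i} - \sqrt{r_i}\right)\right\}\left\{k\sqrt{q_i} - \sqrt{r_i}\right\} = (k-1)\Bigl[\sum_{i=1}^{n}\left\{\sqrt{q_i} - \lambda\left(k\sqrt{q_i} - \sqrt{r_i}\right)\right\}^2\Bigr]^{1/2}$$

or

$$\Bigl[\sum_{i=1}^{n}\left\{\sqrt{q_i} - \lambda\left(k\sqrt{q_i} - \sqrt{r_i}\right)\right\}\left\{k\sqrt{q_i} - \sqrt{r_i}\right\}\Bigr]^2 = (k-1)^2\sum_{i=1}^{n}\left\{\sqrt{q_i} - \lambda\left(k\sqrt{q_i} - \sqrt{r_i}\right)\right\}^2$$

or, if we denote $\sigma = \sum_{i=1}^{n}\sqrt{r_i q_i}$, we get the quadratic equation

$$\lambda^2\{(k^2+1-2k\sigma)^2 - (k-1)^2(k^2+1-2k\sigma)\} - 2\lambda\{(k-\sigma)(k^2+1-2k\sigma) - (k-1)^2(k-\sigma)\} + \{k^2 + \sigma(\sigma-2k) - (k-1)^2\} = 0 \tag{6.16}$$

and its solution is given by

$$\lambda = \frac{-\mathrm{I} \pm \mathrm{II}}{2\,\mathrm{III}}$$

where

$\mathrm{I} = -2\{(k-\sigma)(k^2+1-2k\sigma) - (k-1)^2(k-\sigma)\}$,

$\mathrm{II} = \sqrt{\mathrm{I}^2 - 4\,\mathrm{III}\,\{k^2 + \sigma(\sigma-2k) - (k-1)^2\}}$

and $\mathrm{III} = (k^2+1-2k\sigma)^2 - (k-1)^2(k^2+1-2k\sigma)$.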
Example 6.3.1. For $\alpha = 1/2$, Q = (.7, .3) and R = (.4, .6) we get the two solutions for (6.16) as $\lambda$ = 11.007, 0.013 and $P_1$ = (.9268, .0732), $P_2$ = (.696, .304) respectively. The corresponding distances $D^{1/2}(P_1:Q)$ and $D^{1/2}(P_2:Q)$ are 0.0927 and $0.183 \times 10^{-4}$. So the optimal solution is given by $P_2$.


Case (ii) $\alpha = 2/3$.

Equation (6.15) becomes, with the notation that $A_i = kq_i^{1/3} - r_i^{1/3}$,

$$\sum_{i=1}^{n}\left(q_i^{1/3} - \lambda A_i\right)^2 A_i + (1-k)\Bigl[\sum_{i=1}^{n}\left(q_i^{1/3} - \lambda A_i\right)^3\Bigr]^{2/3} = 0,$$

that is,

$$\Bigl\{\sum_{i=1}^{n}\left(q_i^{1/3} - \lambda A_i\right)^2 A_i\Bigr\}^3 = (k-1)^3\Bigl\{\sum_{i=1}^{n}\left(q_i^{1/3} - \lambda A_i\right)^3\Bigr\}^2.$$

One of the six roots of the above sixth degree equation in $\lambda$ gives the optimal distribution $\{p_i\}$ when substituted in (6.14).

6.4 Problem 4. Out of all distributions whose sum of distances from Q and R is b, find that distribution which is closest to Q.

Solution. Minimize

$$\frac{1}{\alpha-1}\left\{\sum_{i=1}^{n} p_i^{\alpha}q_i^{1-\alpha} - 1\right\}$$

subject to

$$\text{(i)}\ \sum_{i=1}^{n} p_i = 1, \qquad \text{(ii)}\ \frac{1}{\alpha-1}\sum_{i=1}^{n} p_i^{\alpha}\left(q_i^{1-\alpha} + r_i^{1-\alpha}\right) = b' \tag{6.17}$$

where $b' = b + \dfrac{2}{\alpha-1}$.

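By applying Lagrange's multipliers method we obtain the minimizing distribution $\{p_i\}$ as given below:

$$p_i = \frac{\left\{q_i^{1-\alpha} - \lambda\left(q_i^{1-\alpha} + r_i^{1-\alpha}\right)\right\}^{1/(1-\alpha)}}{\sum_{i=1}^{n}\left\{q_i^{1-\alpha} - \lambda\left(q_i^{1-\alpha} + r_i^{1-\alpha}\right)\right\}^{1/(1-\alpha)}} \tag{6.18}$$

and the equation to eliminate $\lambda$ from (6.18) is

$$\frac{1}{\alpha-1}\,\frac{\sum_{i=1}^{n}\left\{q_i^{1-\alpha} - \lambda\left(q_i^{1-\alpha} + r_i^{1-\alpha}\right)\right\}^{\alpha/(1-\alpha)}\left(q_i^{1-\alpha} + r_i^{1-\alpha}\right)}{\Bigl[\sum_{i=1}^{n}\left\{q_i^{1-\alpha} - \lambda\left(q_i^{1-\alpha} + r_i^{1-\alpha}\right)\right\}^{1/(1-\alpha)}\Bigr]^{\alpha}} = b'. \tag{6.19}$$

Now we consider the special cases.

Case (i) $\alpha = 1/2$.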


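Then we have from (6.19)

$$-2\sum_{i=1}^{n}\left\{\sqrt{q_i} - \lambda\left(\sqrt{q_i} + \sqrt{r_i}\right)\right\}\left(\sqrt{q_i} + \sqrt{r_i}\right) = b'\Bigl[\sum_{i=1}^{n}\left\{\sqrt{q_i} - \lambda\left(\sqrt{q_i} + \sqrt{r_i}\right)\right\}^2\Bigr]^{1/2}.$$

Squaring both sides we get

$$4\Bigl[\sum_{i=1}^{n}\left\{\sqrt{q_i} - \lambda\left(\sqrt{q_i} + \sqrt{r_i}\right)\right\}\left(\sqrt{q_i} + \sqrt{r_i}\right)\Bigr]^2 = b'^2\sum_{i=1}^{n}\left\{\sqrt{q_i} - \lambda\left(\sqrt{q_i} + \sqrt{r_i}\right)\right\}^2$$

which is equivalent to the following quadratic equation in $\lambda$, with $A_i = \sqrt{q_i} + \sqrt{r_i}$:

$$\lambda^2\Bigl(\sum A_i^2\Bigr)\Bigl(4\sum A_i^2 - b'^2\Bigr) - 2\lambda\Bigl\{4\sum A_i^2\Bigl(1 + \sum\sqrt{r_i q_i}\Bigr) - b'^2\sum A_i\sqrt{q_i}\Bigr\} + \Bigl\{4\Bigl(1 + \sum\sqrt{r_i q_i}\Bigr)^2 - b'^2\Bigr\} = 0. \tag{6.20}$$

Hence

$$\lambda = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

where, in this formula only, a, b and c denote the coefficients of (6.20):

$a = \bigl(\sum A_i^2\bigr)\bigl(4\sum A_i^2 - b'^2\bigr)$,

$b = -2\bigl\{4\sum A_i^2\bigl(1 + \sum\sqrt{r_i q_i}\bigr) - b'^2\sum A_i\sqrt{q_i}\bigr\}$ and

$c = 4\bigl(1 + \sum\sqrt{r_i q_i}\bigr)^2 - b'^2$.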


Example 6.4.1. For $\alpha = 1/2$, Q = (.7, .3) and R = (.4, .6) we get the two solutions for (6.20) as $\lambda$ = 0.0457 and 11.06, and $P_1$ = (0.713, 0.287), $P_2$ = (0.545, 0.456) respectively. The corresponding distances $D^{1/2}(P_1:Q)$ and $D^{1/2}(P_2:Q)$ are $0.229 \times 10^{-3}$ and $0.257 \times 10^{-1}$. So the optimal solution is given by $P_1$.
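The quadratic of case (i) is easy to solve numerically. The sketch below is illustrative only: the function name, the tolerance and the sample value b = 0.06 are assumptions, not taken from the text. It forms the coefficients of the quadratic in $\lambda$ with $A_i = \sqrt{q_i} + \sqrt{r_i}$ and $b' = b - 4$ (since $\alpha = 1/2$), then keeps only a root that gives nonnegative brackets and actually satisfies the constraint, because squaring can introduce a spurious root.

```python
import math

def hc_half(p, q):
    """Havrda-Charvat divergence for alpha = 1/2."""
    return -2.0 * (sum(math.sqrt(x * y) for x, y in zip(p, q)) - 1.0)

def closest_with_distance_sum_half(q, r, b, tol=1e-9):
    """Sketch: P with D(P:Q) + D(P:R) = b that is closest to Q, alpha = 1/2."""
    A = [math.sqrt(x) + math.sqrt(y) for x, y in zip(q, r)]
    s2 = sum(v * v for v in A)                          # sum of A_i^2
    u = sum(math.sqrt(x) * v for x, v in zip(q, A))     # sum A_i sqrt(q_i) = 1 + sum sqrt(r_i q_i)
    bp = b - 4.0                                        # b' = b + 2/(alpha - 1)
    qa = s2 * (4.0 * s2 - bp * bp)
    qb = -2.0 * (4.0 * s2 * u - bp * bp * u)
    qc = 4.0 * u * u - bp * bp
    disc = qb * qb - 4.0 * qa * qc
    if qa == 0.0 or disc < 0.0:
        raise ValueError("no real root for this value of b")
    best = None
    for sign in (1.0, -1.0):
        lam = (-qb + sign * math.sqrt(disc)) / (2.0 * qa)
        brackets = [math.sqrt(x) - lam * v for x, v in zip(q, A)]
        if min(brackets) < 0.0:
            continue  # p_i^(1/2) would have to be negative
        total = sum(v * v for v in brackets)
        p = [v * v / total for v in brackets]
        if abs(hc_half(p, q) + hc_half(p, r) - b) > tol:
            continue  # root introduced by squaring
        if best is None or hc_half(p, q) < hc_half(best[1], q):
            best = (lam, p)
    if best is None:
        raise ValueError("no admissible root for this value of b")
    return best

Q = (0.7, 0.3)
R = (0.4, 0.6)
lam, P = closest_with_distance_sum_half(Q, R, 0.06)
```

Filtering the roots by substitution, rather than by sign conventions alone, keeps the sketch robust when the two roots are of similar size.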

Case (ii) $\alpha = 2/3$.

Then (6.19) becomes

$$-3\sum_{i=1}^{n}\left\{q_i^{1/3} - \lambda\left(q_i^{1/3} + r_i^{1/3}\right)\right\}^2\left(q_i^{1/3} + r_i^{1/3}\right) = b'\Bigl[\sum_{i=1}^{n}\left\{q_i^{1/3} - \lambda\left(q_i^{1/3} + r_i^{1/3}\right)\right\}^3\Bigr]^{2/3}$$

or, with the notation $B_i = q_i^{1/3} + r_i^{1/3}$, the following sixth degree equation in $\lambda$:

$$-27\Bigl\{\sum_{i=1}^{n}\left(q_i^{1/3} - \lambda B_i\right)^2 B_i\Bigr\}^3 = b'^3\Bigl\{\sum_{i=1}^{n}\left(q_i^{1/3} - \lambda B_i\right)^3\Bigr\}^2. \tag{6.22}$$

One of the six roots of the equation (6.22) gives the optimal probability distribution $\{p_i\}$ when substituted in (6.18).

6.5 Problem 5. Find the distribution P from which the sum of distances to k given distributions $Q_1, Q_2, \ldots, Q_k$, where $Q_j = (q_{1j}, q_{2j}, \ldots, q_{nj})$, is minimum.

Solution. Minimize

$$\frac{1}{\alpha-1}\sum_{j=1}^{k}\Bigl(\sum_{i=1}^{n} p_i^{\alpha}q_{ij}^{1-\alpha} - 1\Bigr)$$

subject to

$$\sum_{i=1}^{n} p_i = 1. \tag{6.23}$$



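Set the Lagrangian

$$L = \frac{1}{\alpha-1}\sum_{j=1}^{k}\Bigl(\sum_{i=1}^{n} p_i^{\alpha}q_{ij}^{1-\alpha} - 1\Bigr) - \lambda\Bigl(\sum_{i=1}^{n} p_i - 1\Bigr).$$

Equating $\partial L/\partial p_i$ to zero and solving for $p_i$, we get

$$p_i = \Bigl[\frac{\lambda(\alpha-1)}{\alpha}\Bigr]^{1/(\alpha-1)}\Bigl[\sum_{j=1}^{k} q_{ij}^{1-\alpha}\Bigr]^{1/(1-\alpha)}. \tag{6.24}$$

Substitution of (6.24) in (6.23) results in

$$p_i = \frac{\Bigl[\sum_{j=1}^{k} q_{ij}^{1-\alpha}\Bigr]^{1/(1-\alpha)}}{\sum_{i=1}^{n}\Bigl[\sum_{j=1}^{k} q_{ij}^{1-\alpha}\Bigr]^{1/(1-\alpha)}}. \tag{6.25}$$

The measure of distance being a convex function of both P and Q, (6.25) yields the required optimal probability distribution. We now test it for two special cases.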


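Case (i) $\alpha = 1/2$.

Then we have

$$p_i = \frac{\Bigl(\sum_{j=1}^{k}\sqrt{q_{ij}}\Bigr)^2}{\sum_{i=1}^{n}\Bigl(\sum_{j=1}^{k}\sqrt{q_{ij}}\Bigr)^2}.$$

Let us further take n = 2, k = 3, $Q_1$ = (.2, .8), $Q_2$ = (.3, .7) and $Q_3$ = (.4, .6); then we have

$$p_1 = \frac{(\sqrt{.2}+\sqrt{.3}+\sqrt{.4})^2}{(\sqrt{.2}+\sqrt{.3}+\sqrt{.4})^2 + (\sqrt{.6}+\sqrt{.7}+\sqrt{.8})^2} = \frac{(1.543)^2}{(1.543)^2 + (2.51)^2} = .27,$$

$$p_2 = \frac{(\sqrt{.6}+\sqrt{.7}+\sqrt{.8})^2}{(\sqrt{.2}+\sqrt{.3}+\sqrt{.4})^2 + (\sqrt{.6}+\sqrt{.7}+\sqrt{.8})^2} = \frac{(2.510)^2}{(1.543)^2 + (2.51)^2} = .73.$$

The required distribution P is given by (.27, .73).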

We shall check it against an arbitrarily chosen probability distribution, say $P_1 = (0.5, 0.5)$.

$$\sum_{j=1}^{3} D^{1/2}(P:Q_j) = \frac{1}{\alpha-1}\sum_{j=1}^{3}\Bigl(\sum_{i=1}^{2} p_i^{\alpha}q_{ij}^{1-\alpha} - 1\Bigr) = -2\{-3 + \sqrt{.27}(\sqrt{.2}+\sqrt{.3}+\sqrt{.4}) + \sqrt{.73}(\sqrt{.6}+\sqrt{.7}+\sqrt{.8})\} = -2 \times (-0.014) = 0.028,$$

$$\sum_{j=1}^{3} D^{1/2}(P_1:Q_j) = -2\{-3 + \sqrt{.5}(\sqrt{.2}+\sqrt{.3}+\sqrt{.8}+\sqrt{.7}+\sqrt{.4}+\sqrt{.6})\} = -2 \times (-0.077) = 0.154.$$

Hence we have

$$\sum_{j=1}^{3} D^{1/2}(P_1:Q_j) > \sum_{j=1}^{3} D^{1/2}(P:Q_j).$$

Case (ii) $\alpha = 2/3$.

Then we have

$$p_i = \frac{\Bigl\{\sum_{j=1}^{3} q_{ij}^{1/3}\Bigr\}^3}{\sum_{i=1}^{2}\Bigl\{\sum_{j=1}^{3} q_{ij}^{1/3}\Bigr\}^3}.$$

We shall evaluate $p_i$ taking the same distributions $Q_1$, $Q_2$ and $Q_3$ as in case (i).
D 
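$$p_1 = \frac{\bigl((.2)^{1/3} + (.3)^{1/3} + (.4)^{1/3}\bigr)^3}{\bigl((.2)^{1/3} + (.3)^{1/3} + (.4)^{1/3}\bigr)^3 + \bigl((.6)^{1/3} + (.7)^{1/3} + (.8)^{1/3}\bigr)^3} = \frac{7.9896}{7.9896 + 18.8814} = 0.2973317,$$

$$p_2 = \frac{\bigl((.6)^{1/3} + (.7)^{1/3} + (.8)^{1/3}\bigr)^3}{\bigl((.2)^{1/3} + (.3)^{1/3} + (.4)^{1/3}\bigr)^3 + \bigl((.6)^{1/3} + (.7)^{1/3} + (.8)^{1/3}\bigr)^3} = \frac{18.8814}{7.9896 + 18.8814} = 0.7026683.$$

So the required distribution is P = (0.29, 0.71).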


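Now

$$\sum_{j=1}^{3} D^{2/3}(P:Q_j) = -1.5\{-3 + (.29)^{2/3}(.2)^{1/3} + (.71)^{2/3}(.8)^{1/3} + (.29)^{2/3}(.3)^{1/3} + (.71)^{2/3}(.7)^{1/3} + (.29)^{2/3}(.4)^{1/3} + (.71)^{2/3}(.6)^{1/3}\}$$

$$= -1.5\{-3 + (.29)^{2/3}(.2^{1/3} + .3^{1/3} + .4^{1/3}) + (.71)^{2/3}(.6^{1/3} + .7^{1/3} + .8^{1/3})\}$$

$$= -1.5\{-3 + 2.98\} = 0.0293$$

and

$$\sum_{j=1}^{3} D^{2/3}(P_1:Q_j) = -1.5\{-3 + (0.5)^{2/3}(1.98533 + 2.63689)\} = 0.112.$$

Again

$$\sum_{j=1}^{3} D^{2/3}(P_1:Q_j) > \sum_{j=1}^{3} D^{2/3}(P:Q_j).$$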
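The closed form for Problem 5 lends itself to a direct numerical check. The following sketch is an illustration, with function names chosen here rather than taken from the text: it normalizes $\bigl[\sum_j q_{ij}^{1-\alpha}\bigr]^{1/(1-\alpha)}$ over i and compares the resulting total distance with that of other candidate distributions. It assumes all $q_{ij} > 0$.

```python
def hc_divergence(p, q, alpha):
    """Havrda-Charvat directed divergence D^alpha(P:Q), alpha > 0, alpha != 1."""
    total = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q))
    return (total - 1.0) / (alpha - 1.0)

def total_divergence(p, dists, alpha):
    """Sum of D^alpha(P:Q_j) over the given distributions Q_j."""
    return sum(hc_divergence(p, qj, alpha) for qj in dists)

def min_total_divergence(dists, alpha):
    """P minimizing the total divergence: normalized [sum_j q_ij^(1-a)]^(1/(1-a))."""
    n = len(dists[0])
    e = 1.0 - alpha
    weights = [sum(qj[i] ** e for qj in dists) ** (1.0 / e) for i in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

# The three distributions of the worked example (n = 2, k = 3).
Qs = [(0.2, 0.8), (0.3, 0.7), (0.4, 0.6)]
P_half = min_total_divergence(Qs, 0.5)
P_two_thirds = min_total_divergence(Qs, 2.0 / 3.0)
```

Because the objective is convex in P for every admissible $\alpha$, comparing `total_divergence(P_half, Qs, 0.5)` with the value at any other distribution, for instance (0.5, 0.5), should never favour the other distribution.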
6.6 Problem 6. Out of all distributions from which the sum of weighted distances to $Q_1, Q_2, \ldots, Q_m$ is a constant, find the distribution that is closest to $Q_0$.

Solution. Minimize

$$\frac{1}{\alpha-1}\Bigl\{\sum_{i=1}^{n} p_i^{\alpha}q_{i0}^{1-\alpha} - 1\Bigr\}$$

subject to

$$\text{(i)}\ \sum_{i=1}^{n} p_i = 1, \qquad \text{(ii)}\ \frac{1}{\alpha-1}\sum_{j=1}^{m}\lambda_j\Bigl(\sum_{i=1}^{n} p_i^{\alpha}q_{ij}^{1-\alpha} - 1\Bigr) = T. \tag{6.26}$$


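By applying Lagrange's multipliers method, we get the minimizing distribution $\{p_i\}$ as given below:

$$p_i = \frac{\Bigl(q_{i0}^{1-\alpha} - \lambda\sum_{k=1}^{m}\lambda_k q_{ik}^{1-\alpha}\Bigr)^{1/(1-\alpha)}}{\sum_{i=1}^{n}\Bigl(q_{i0}^{1-\alpha} - \lambda\sum_{k=1}^{m}\lambda_k q_{ik}^{1-\alpha}\Bigr)^{1/(1-\alpha)}}. \tag{6.27}$$

Now $\lambda$ is to be determined from the following equation:

$$\frac{1}{\alpha-1}\sum_{j=1}^{m}\lambda_j\Bigl(\sum_{i=1}^{n} p_i^{\alpha}q_{ij}^{1-\alpha} - 1\Bigr) = T$$

or

$$\frac{1}{\alpha-1}\sum_{j=1}^{m}\lambda_j\,\frac{\sum_{i=1}^{n}\Bigl(q_{i0}^{1-\alpha} - \lambda\sum_{k=1}^{m}\lambda_k q_{ik}^{1-\alpha}\Bigr)^{\alpha/(1-\alpha)} q_{ij}^{1-\alpha}}{\Bigl[\sum_{i=1}^{n}\Bigl(q_{i0}^{1-\alpha} - \lambda\sum_{k=1}^{m}\lambda_k q_{ik}^{1-\alpha}\Bigr)^{1/(1-\alpha)}\Bigr]^{\alpha}} = T' \tag{6.28}$$

where $T' = T + \dfrac{1}{\alpha-1}\sum_{j=1}^{m}\lambda_j$.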

Now we shall consider the special case $\alpha = 1/2$.

Then (6.28) becomes

$$-2\Bigl[\sum_{j=1}^{m}\lambda_j\sum_{i=1}^{n}\Bigl(\sqrt{q_{i0}} - \lambda\sum_{k=1}^{m}\lambda_k\sqrt{q_{ik}}\Bigr)\sqrt{q_{ij}}\Bigr] = T'\Bigl[\sum_{i=1}^{n}\Bigl(\sqrt{q_{i0}} - \lambda\sum_{k=1}^{m}\lambda_k\sqrt{q_{ik}}\Bigr)^2\Bigr]^{1/2} \tag{6.29}$$

or

$$4\Bigl[\sum_{j=1}^{m}\lambda_j\sum_{i=1}^{n}\Bigl(\sqrt{q_{i0}} - \lambda\sum_{k=1}^{m}\lambda_k\sqrt{q_{ik}}\Bigr)\sqrt{q_{ij}}\Bigr]^2 = T'^2\sum_{i=1}^{n}\Bigl(\sqrt{q_{i0}} - \lambda\sum_{k=1}^{m}\lambda_k\sqrt{q_{ik}}\Bigr)^2,$$

that is,

$$\lambda^2\{4S_1^2 - T'^2 S_1\} - 2\lambda\{4S_1S_2 - T'^2 S_2\} + \{4S_2^2 - T'^2\} = 0 \tag{6.30}$$

where

$$S_1 = \sum_{i=1}^{n}\sum_{j=1}^{m}\sum_{k=1}^{m}\lambda_j\lambda_k\sqrt{q_{ij}q_{ik}} = \sum_{i=1}^{n}\Bigl(\sum_{j=1}^{m}\lambda_j\sqrt{q_{ij}}\Bigr)^2 \quad\text{and}\quad S_2 = \sum_{i=1}^{n}\sum_{j=1}^{m}\lambda_j\sqrt{q_{i0}q_{ij}}.$$

(6.30) gives two values of $\lambda$, one of which gives the required optimal distribution $p_i$ when substituted in (6.27):

$$\lambda = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

where $a = 4S_1^2 - T'^2 S_1$, $b = -2\{4S_1S_2 - T'^2 S_2\}$ and $c = 4S_2^2 - T'^2$.

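6.7 Problem 7. Out of all distributions which are equally close to Q and R, find that distribution which is closest to another distribution S.

Solution: Minimize $\frac{1}{\alpha-1}\left\{\sum_{i=1}^{n} p_i^{\alpha}s_i^{1-\alpha} - 1\right\}$ subject to

$$\text{(i)}\ \sum_{i=1}^{n} p_i = 1, \qquad \text{(ii)}\ \frac{1}{\alpha-1}\sum_{i=1}^{n} p_i^{\alpha}\left(q_i^{1-\alpha} - r_i^{1-\alpha}\right) = 0. \tag{6.31}$$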


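By applying Lagrange's method, we find that the optimal distribution $\{p_i\}$ is given by

$$p_i = \frac{\left\{s_i^{1-\alpha} - \lambda\left(q_i^{1-\alpha} - r_i^{1-\alpha}\right)\right\}^{1/(1-\alpha)}}{\sum_{i=1}^{n}\left\{s_i^{1-\alpha} - \lambda\left(q_i^{1-\alpha} - r_i^{1-\alpha}\right)\right\}^{1/(1-\alpha)}}. \tag{6.32}$$

Now $\lambda$ is to be eliminated by using (6.32) and (6.31); we get after the substitution

$$\sum_{i=1}^{n}\frac{\left\{s_i^{1-\alpha} - \lambda\left(q_i^{1-\alpha} - r_i^{1-\alpha}\right)\right\}^{\alpha/(1-\alpha)}}{\Bigl[\sum_{i=1}^{n}\left\{s_i^{1-\alpha} - \lambda\left(q_i^{1-\alpha} - r_i^{1-\alpha}\right)\right\}^{1/(1-\alpha)}\Bigr]^{\alpha}}\left(q_i^{1-\alpha} - r_i^{1-\alpha}\right) = 0. \tag{6.33}$$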


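Case (i) Let $\alpha = 1/2$. Then (6.33) becomes

$$\sum_{i=1}^{n}\left\{\sqrt{s_i} - \lambda\left(\sqrt{q_i} - \sqrt{r_i}\right)\right\}\left(\sqrt{q_i} - \sqrt{r_i}\right) = 0$$

or

$$\sum_{i=1}^{n}\sqrt{q_i s_i} - \sum_{i=1}^{n}\sqrt{r_i s_i} - 2\lambda\Bigl(1 - \sum_{i=1}^{n}\sqrt{r_i q_i}\Bigr) = 0. \tag{6.34}$$

From (6.34),

$$\lambda = \frac{\sum_{i=1}^{n}\left(\sqrt{q_i} - \sqrt{r_i}\right)\sqrt{s_i}}{2\Bigl(1 - \sum_{i=1}^{n}\sqrt{r_i q_i}\Bigr)}$$

which when substituted in (6.32) gives the required minimizing distribution.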
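Case (i) of Problem 7 has a fully explicit solution, which the sketch below evaluates and checks for equidistance. It is an illustration under stated assumptions: the function name and sample distributions are hypothetical, the multiplier is computed as $\lambda = \sum(\sqrt{q_i}-\sqrt{r_i})\sqrt{s_i}\,/\,\{2(1-\sum\sqrt{r_i q_i})\}$ (which follows from $\sum(\sqrt{q_i}-\sqrt{r_i})^2 = 2(1-\sum\sqrt{r_i q_i})$), and $Q \neq R$ is assumed so that the denominator is nonzero.

```python
import math

def hc_half(p, q):
    """Havrda-Charvat divergence for alpha = 1/2."""
    return -2.0 * (sum(math.sqrt(x * y) for x, y in zip(p, q)) - 1.0)

def closest_to_s_equidistant_half(q, r, s):
    """Sketch: P equidistant from Q and R that is closest to S, alpha = 1/2."""
    d = [math.sqrt(x) - math.sqrt(y) for x, y in zip(q, r)]
    sigma = sum(math.sqrt(x * y) for x, y in zip(q, r))
    lam = sum(di * math.sqrt(si) for di, si in zip(d, s)) / (2.0 * (1.0 - sigma))
    brackets = [math.sqrt(si) - lam * di for si, di in zip(s, d)]
    if min(brackets) < 0.0:
        raise ValueError("a bracket is negative; the closed form does not apply")
    total = sum(v * v for v in brackets)
    return lam, [v * v / total for v in brackets]

Q = (0.5, 0.3, 0.2)
R = (0.2, 0.3, 0.5)
S = (0.6, 0.3, 0.1)
lam, P = closest_to_s_equidistant_half(Q, R, S)
gap = hc_half(P, Q) - hc_half(P, R)  # should vanish up to rounding
```

Taking S = Q reproduces Problem 1, for which the same formula gives $\lambda = 1/2$.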
2 

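Case (ii) $\alpha = 2/3$. Then we have from (6.33)

$$\sum_{i=1}^{n}\left\{s_i^{1/3} - \lambda\left(q_i^{1/3} - r_i^{1/3}\right)\right\}^2\left(q_i^{1/3} - r_i^{1/3}\right) = 0.$$

Let $A_i = q_i^{1/3} - r_i^{1/3}$; then the above equation becomes

$$\sum_{i=1}^{n}\left\{s_i^{2/3} + \lambda^2A_i^2 - 2\lambda s_i^{1/3}A_i\right\}A_i = 0$$

or the quadratic equation in $\lambda$:

$$\lambda^2\Bigl(\sum_{i=1}^{n} A_i^3\Bigr) - 2\lambda\Bigl(\sum_{i=1}^{n} s_i^{1/3}A_i^2\Bigr) + \Bigl(\sum_{i=1}^{n} s_i^{2/3}A_i\Bigr) = 0$$

whose solutions are given by

$$\lambda = \frac{\sum_{i=1}^{n} s_i^{1/3}A_i^2 \pm \sqrt{\Bigl(\sum_{i=1}^{n} s_i^{1/3}A_i^2\Bigr)^2 - \Bigl(\sum_{i=1}^{n} A_i^3\Bigr)\Bigl(\sum_{i=1}^{n} s_i^{2/3}A_i\Bigr)}}{\sum_{i=1}^{n} A_i^3}.$$

One of these roots will give the optimal distribution $\{p_i\}$ when substituted in (6.32).

With that we come to the end of this chapter.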